EXCERPTS FROM THE PREFACES TO THE 
FIRST AND SECOND EDITIONS 


This book is devoted to the presentation of the theory of the electromagnetic and gravitational 
fields, i.e. electrodynamics and general relativity. A complete, logically connected theory of 
the electromagnetic field includes the special theory of relativity, so the latter has been taken 
as the basis of the presentation. As the starting point of the derivation of the fundamental 
relations we take the variational principles, which make possible the attainment of maximum 
generality, unity and simplicity of presentation. 

In accordance with the overall plan of our Course of Theoretical Physics (of which this 
book is a part), we have not considered questions concerning the electrodynamics of continuous 
media, but restricted the discussion to “microscopic electrodynamics”—the electrodynamics 
of point charges in vacuo. 

The reader is assumed to be familiar with electromagnetic phenomena as discussed in 
general physics courses. A knowledge of vector analysis is also necessary. The reader is not 
assumed to have any previous knowledge of tensor analysis, which is presented in parallel 
with the development of the theory of gravitational fields. 


Moscow, December 1939 
Moscow, June 1947 


L. Landau, E. Lifshitz 



PREFACE TO THE FOURTH ENGLISH EDITION 


The first edition of this book appeared more than thirty years ago. In the course of reissues 
over these decades the book has been revised and expanded; its volume has almost doubled 
since the first edition. But at no time has there been any need to change the method proposed 
by Landau for developing the theory, or his style of presentation, whose main feature was 
a striving for clarity and simplicity. I have made every effort to preserve this style in the 
revisions that I have had to make on my own. 

As compared with the preceding edition, the first nine chapters, devoted to electrodynamics, 
have remained almost without changes. The chapters concerning the theory of the gravitational 
field have been revised and expanded. The material in these chapters has increased from 
edition to edition, and it was finally necessary to redistribute and rearrange it. 

I should like to express here my deep gratitude to all of my helpers in this work—too 
many to be enumerated—who, by their comments and advice, helped me to eliminate errors 
and introduce improvements. Without their advice, without the willingness to help which 
has met all my requests, the work to continue the editions of this course would have been 
much more difficult. A special debt of gratitude is due to L. P. Pitaevskii, with whom I have 
constantly discussed all the vexing questions. 

The English translation of the book was done from the last Russian edition, which appeared 
in 1973. No further changes in the book have been made. The 1994 corrected reprint 
includes the changes made by E. M. Lifshitz in the Seventh Russian Edition published in 
1987. 

I should also like to use this occasion to sincerely thank Prof. Hamermesh, who has 
translated this book in all its editions, starting with the first English edition in 1951. The 
success of this book among English-speaking readers is to a large extent the result of his 
labour and careful attention. 


E. M. Lifshitz 


PUBLISHER’S NOTE 

As with the other volumes in the Course of Theoretical Physics, the authors do not, as a rule, 
give references to original papers, but simply name their authors (with dates). Full bibliographic 
references are only given to works which contain matters not fully expounded in the text. 



EDITOR’S PREFACE TO THE 
SEVENTH RUSSIAN EDITION 


E. M. Lifshitz began to prepare a new edition of Teoria Polia in 1985 and continued his 
work on it even in hospital during the period of his last illness. The changes that he proposed 
are made in the present edition. Of these we should mention some revision of the proof of 
the law of conservation of angular momentum in relativistic mechanics, and also a more 
detailed discussion of the question of symmetry of the Christoffel symbols in the theory of 
gravitation. The sign has been changed in the definition of the electromagnetic field stress 
tensor. (In the present edition this tensor was defined differently than in the other volumes 
of the Course.) 


June 1987 


L. P. Pitaevskii 



NOTATION 


Three-dimensional quantities 

Three-dimensional tensor indices are denoted by Greek letters 
Element of volume, area and length: $dV$, $df$, $dl$

Momentum and energy of a particle: $\mathbf{p}$ and $\mathcal{E}$

Hamiltonian function: $\mathcal{H}$

Scalar and vector potentials of the electromagnetic field: $\phi$ and $\mathbf{A}$

Electric and magnetic field intensities: $\mathbf{E}$ and $\mathbf{H}$

Charge and current density: $\rho$ and $\mathbf{j}$

Electric dipole moment: $\mathbf{d}$

Magnetic dipole moment: $\mathfrak{m}$


Four-dimensional quantities

Four-dimensional tensor indices are denoted by Latin letters $i, k, l, \ldots$ and take on the values 0, 1, 2, 3

We use the metric with signature $(+\,-\,-\,-)$

Rule for raising and lowering indices: see p. 14

Components of four-vectors are enumerated in the form $A^i = (A^0, \mathbf{A})$

Antisymmetric unit tensor of rank four is $e^{iklm}$, where $e^{0123} = 1$ (for the definition, see p. 17)

Element of four-volume: $d\Omega = dx^0\,dx^1\,dx^2\,dx^3$

Element of hypersurface: $dS^i$ (defined on pp. 20-21)

Radius four-vector: $x^i = (ct, \mathbf{r})$

Velocity four-vector: $u^i = dx^i/ds$

Momentum four-vector: $p^i = (\mathcal{E}/c, \mathbf{p})$

Current four-vector: $j^i = (c\rho, \rho\mathbf{v})$

Four-potential of the electromagnetic field: $A^i = (\phi, \mathbf{A})$

Electromagnetic field four-tensor: $F_{ik} = \dfrac{\partial A_k}{\partial x^i} - \dfrac{\partial A_i}{\partial x^k}$ (for the relation of the components of $F_{ik}$ to the components of $\mathbf{E}$ and $\mathbf{H}$, see p. 65)

Energy-momentum four-tensor: $T^{ik}$ (for the definition of its components, see p. 83)
THE PRINCIPLE OF RELATIVITY 


§ 1. Velocity of propagation of interaction 

For the description of processes taking place in nature, one must have a system of reference. 
By a system of reference we understand a system of coordinates serving to indicate the 
position of a particle in space, as well as clocks fixed in this system serving to indicate the 
time. 

There exist systems of reference in which a freely moving body, i.e. a moving body which 
is not acted upon by external forces, proceeds with constant velocity. Such reference systems 
are said to be inertial. 

If two reference systems move uniformly relative to each other, and if one of them is an 
inertial system, then clearly the other is also inertial (in this system too every free motion 
will be linear and uniform). In this way one can obtain arbitrarily many inertial systems of 
reference, moving uniformly relative to one another. 

Experiment shows that the so-called principle of relativity is valid. According to this 
principle all the laws of nature are identical in all inertial systems of reference. In other 
words, the equations expressing the laws of nature are invariant with respect to transformations 
of coordinates and time from one inertial system to another. This means that the equation 
describing any law of nature, when written in terms of coordinates and time in different 
inertial reference systems, has one and the same form. 

The interaction of material particles is described in ordinary mechanics by means of a 
potential energy of interaction, which appears as a function of the coordinates of the interacting 
particles. It is easy to see that this manner of describing interactions contains the assumption 
of instantaneous propagation of interactions. For the forces exerted on each of the particles 
by the other particles at a particular instant of time depend, according to this description, 
only on the positions of the particles at this one instant. A change in the position of any of 
the interacting particles influences the other particles immediately. 

However, experiment shows that instantaneous interactions do not exist in nature. Thus a 
mechanics based on the assumption of instantaneous propagation of interactions contains 
within itself a certain inaccuracy. In actuality, if any change takes place in one of the 
interacting bodies, it will influence the other bodies only after the lapse of a certain interval 
of time. It is only after this time interval that processes caused by the initial change begin 
to take place in the second body. Dividing the distance between the two bodies by this time 
interval, we obtain the velocity of propagation of the interaction. 

We note that this velocity should, strictly speaking, be called the maximum velocity of 
propagation of interaction. It determines only that interval of time after which a change 
occurring in one body begins to manifest itself in another. It is clear that the existence of a 
maximum velocity of propagation of interactions implies, at the same time, that motions of 
bodies with greater velocity than this are in general impossible in nature. For if such a 
motion could occur, then by means of it one could realize an interaction with a velocity 
exceeding the maximum possible velocity of propagation of interactions. 

Interactions propagating from one particle to another are frequently called “signals”, sent 
out from the first particle and “informing” the second particle of changes which the first has 
experienced. The velocity of propagation of interaction is then referred to as the signal 
velocity. 

From the principle of relativity it follows in particular that the velocity of propagation of 
interactions is the same in all inertial systems of reference. Thus the velocity of propagation 
of interactions is a universal constant. This constant velocity (as we shall show later) is also 
the velocity of light in empty space. The velocity of light is usually designated by the letter 
c, and its numerical value is 


$$c = 2.998 \times 10^{10}\ \text{cm/sec}. \tag{1.1}$$

The large value of this velocity explains the fact that in practice classical mechanics 
appears to be sufficiently accurate in most cases. The velocities with which we have occasion 
to deal are usually so small compared with the velocity of light that the assumption that the 
latter is infinite does not materially affect the accuracy of the results. 
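
To give a sense of the magnitudes involved, the following minimal Python sketch evaluates the ratio v/c and the fractional correction $1 - \sqrt{1 - v^2/c^2}$ for a few everyday velocities. The correction factor is the one that appears later in formula (3.1); the function name and the sample velocities are illustrative assumptions and are not taken from the text.

```python
import math

C = 2.998e10  # velocity of light in cm/sec, from (1.1)


def relativistic_correction(v, c=C):
    """Return 1 - sqrt(1 - v^2/c^2), the fractional departure from classical results.

    expm1/log1p are used so that the tiny result is not lost to rounding
    when v is very small compared with c.
    """
    beta2 = (v / c) ** 2
    return -math.expm1(0.5 * math.log1p(-beta2))


# Illustrative velocities in cm/sec (assumed round numbers).
samples = {
    "train, 50 m/sec": 5.0e3,
    "rifle bullet, 1 km/sec": 1.0e5,
    "earth in its orbit, 30 km/sec": 3.0e6,
}

for name, v in samples.items():
    print(f"{name}: v/c = {v / C:.3e}, correction = {relativistic_correction(v):.3e}")
```

Even for the orbital velocity of the earth the correction is of the order of $v^2/2c^2$, i.e. a few parts in a thousand million.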

The combination of the principle of relativity with the finiteness of the velocity of propagation 
of interactions is called the principle of relativity of Einstein (it was formulated by Einstein 
in 1905) in contrast to the principle of relativity of Galileo, which was based on an infinite 
velocity of propagation of interactions. 

The mechanics based on the Einsteinian principle of relativity (we shall usually refer to it 
simply as the principle of relativity) is called relativistic. In the limiting case when the 
velocities of the moving bodies are small compared with the velocity of light we can neglect 
the effect on the motion of the finiteness of the velocity of propagation. Then relativistic 
mechanics goes over into the usual mechanics, based on the assumption of instantaneous 
propagation of interactions; this mechanics is called Newtonian or classical. The limiting 
transition from relativistic to classical mechanics can be produced formally by the transition 
to the limit $c \to \infty$ in the formulas of relativistic mechanics. 

In classical mechanics distance is already relative, i.e. the spatial relations between different 
events depend on the system of reference in which they are described. The statement that 
two nonsimultaneous events occur at one and the same point in space or, in general, at a 
definite distance from each other, acquires a meaning only when we indicate the system of 
reference which is used. 

On the other hand, time is absolute in classical mechanics; in other words, the properties 
of time are assumed to be independent of the system of reference; there is one time for all 
reference frames. This means that if any two phenomena occur simultaneously for any one 
observer, then they occur simultaneously also for all others. In general, the interval of time 
between two given events must be identical for all systems of reference. 

It is easy to show, however, that the idea of an absolute time is in complete contradiction 
to the Einstein principle of relativity. For this it is sufficient to recall that in classical mechanics, 
based on the concept of an absolute time, a general law of combination of velocities is valid, 
according to which the velocity of a composite motion is simply equal to the (vector) sum 
of the velocities which constitute this motion. This law, being universal, should also be 
applicable to the propagation of interactions. From this it would follow that the velocity of 
propagation must be different in different inertial systems of reference, in contradiction to 
the principle of relativity. In this matter experiment completely confirms the principle of 
relativity. Measurements first performed by Michelson (1881) showed complete lack of 
dependence of the velocity of light on its direction of propagation; whereas according to 
classical mechanics the velocity of light should be smaller in the direction of the earth’s 
motion than in the opposite direction. 

Thus the principle of relativity leads to the result that time is not absolute. Time elapses 
differently in different systems of reference. Consequently the statement that a definite time 
interval has elapsed between two given events acquires meaning only when the reference 
frame to which this statement applies is indicated. In particular, events which are simultaneous 
in one reference frame will not be simultaneous in other frames. 

To clarify this, it is instructive to consider the following simple example. 

Let us look at two inertial reference systems K and K' with coordinate axes XYZ and 
X' Y' Z' respectively, where the system K' moves relative to K along the X(X') axis (Fig. 1). 


Fig. 1 



Suppose signals start out from some point A on the X' axis in two opposite directions. 
Since the velocity of propagation of a signal in the K' system, as in all inertial systems, is 
equal (for both directions) to c, the signals will reach points B and C, equidistant from A, at 
one and the same time (in the K' system). 

But it is easy to see that the same two events (arrival of the signal at B and C) can by no 
means be simultaneous for an observer in the K system. In fact, the velocity of a signal 
relative to the K system has, according to the principle of relativity, the same value c, and 
since the point B moves (relative to the K system) toward the source of its signal, while the 
point C moves in the direction away from the signal (sent from A to C), in the K system the 
signal will reach point B earlier than point C. 
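
The argument can be made quantitative with a short Python sketch. It rests on assumptions that go beyond the text: the distances AB and AC are both taken equal to L as measured in the K system at the moment of emission, B is taken to lie behind A and C ahead of A in the direction of motion, and units with c = 1 are used by default. The function name is hypothetical.

```python
def arrival_times_in_K(L, V, c=1.0):
    """Times (in K) at which the signals from A reach B and C.

    Assumptions of this sketch: at t = 0 the signals leave A; at that
    moment B lies a distance L behind A and C a distance L ahead of A,
    all distances being measured in K; the three points move along X
    with the velocity V of the K' system, 0 <= V < c.
    """
    if not 0 <= V < c:
        raise ValueError("V must satisfy 0 <= V < c")
    # B moves toward the signal: the gap L closes at the rate c + V.
    t_B = L / (c + V)
    # C moves away from the signal: the gap L closes at the rate c - V.
    t_C = L / (c - V)
    return t_B, t_C


t_B, t_C = arrival_times_in_K(L=1.0, V=0.5)
print(t_B, t_C)  # the signal reaches B first
```

For V = 0 the two times coincide; for any V > 0 the arrival at B precedes the arrival at C in the K system.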

Thus the principle of relativity of Einstein introduces very drastic and fundamental changes 
in basic physical concepts. The notions of space and time derived by us from our daily 
experiences are only approximations linked to the fact that in daily life we happen to deal 
only with velocities which are very small compared with the velocity of light. 

§ 2. Intervals 

In what follows we shall frequently use the concept of an event. An event is described by 
the place where it occurred and the time when it occurred. Thus an event occurring in a 
certain material particle is defined by the three coordinates of that particle and the time when 
the event occurs. 

It is frequently useful for reasons of presentation to use a fictitious four-dimensional 
space, on the axes of which are marked three space coordinates and the time. In this space 
events are represented by points, called world points. In this fictitious four-dimensional 
space there corresponds to each particle a certain line, called a world line. The points of this 
line determine the coordinates of the particle at all moments of time. It is easy to show that 
to a particle in uniform rectilinear motion there corresponds a straight world line. 

We now express the principle of the invariance of the velocity of light in mathematical 
form. For this purpose we consider two reference systems K and K' moving relative to each 
other with constant velocity. We choose the coordinate axes so that the axes X and X' 
coincide, while the Y and Z axes are parallel to Y' and Z'; we designate the time in the 
systems K and K' by t and t'. 

Let the first event consist of sending out a signal, propagating with light velocity, from a 
point having coordinates $x_1, y_1, z_1$ in the K system, at time $t_1$ in this system. We observe the 
propagation of this signal in the K system. Let the second event consist of the arrival of the 
signal at point $x_2, y_2, z_2$ at the moment of time $t_2$. The signal propagates with velocity c; 
the distance covered by it is therefore $c(t_2 - t_1)$. On the other hand, this same distance equals 
$[(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2]^{1/2}$. Thus we can write the following relation between the 
coordinates of the two events in the K system: 

$$(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 - c^2 (t_2 - t_1)^2 = 0. \tag{2.1}$$

The same two events, i.e. the propagation of the signal, can be observed from the K' 
system. 

Let the coordinates of the first event in the K' system be $x'_1, y'_1, z'_1, t'_1$, and of the second: 
$x'_2, y'_2, z'_2, t'_2$. Since the velocity of light is the same in the K and K' systems, we have, similarly 
to (2.1): 

$$(x'_2 - x'_1)^2 + (y'_2 - y'_1)^2 + (z'_2 - z'_1)^2 - c^2 (t'_2 - t'_1)^2 = 0. \tag{2.2}$$

If $x_1, y_1, z_1, t_1$ and $x_2, y_2, z_2, t_2$ are the coordinates of any two events, then the quantity 

$$s_{12} = [c^2 (t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2]^{1/2} \tag{2.3}$$

is called the interval between these two events. 

Thus it follows from the principle of invariance of the velocity of light that if the interval 
between two events is zero in one coordinate system, then it is equal to zero in all other 
systems. 

If two events are infinitely close to each other, then the interval ds between them is 

$$ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2. \tag{2.4}$$

The form of expressions (2.3) and (2.4) permits us to regard the interval, from the formal 
point of view, as the distance between two points in a fictitious four-dimensional space 
(whose axes are labelled by x, y, z, and the product ct). But there is a basic difference 
between the rule for forming this quantity and the rule in ordinary geometry: in forming the 
square of the interval, the squares of the coordinate differences along the different axes are 
summed, not with the same sign, but rather with varying signs.† 

† The four-dimensional geometry described by the quadratic form (2.4) was introduced by H. Minkowski, 
in connection with the theory of relativity. This geometry is called pseudo-euclidean, in contrast to ordinary 
euclidean geometry. 

As already shown, if $ds = 0$ in one inertial system, then $ds' = 0$ in any other system. On 
the other hand, $ds$ and $ds'$ are infinitesimals of the same order. From these two conditions 
it follows that $ds^2$ and $ds'^2$ must be proportional to each other: 

$$ds^2 = a\,ds'^2,$$

where the coefficient a can depend only on the absolute value of the relative velocity of the 
two inertial systems. It cannot depend on the coordinates or the time, since then different 
points in space and different moments in time would not be equivalent, which would be in 
contradiction to the homogeneity of space and time. Similarly, it cannot depend on the 
direction of the relative velocity, since that would contradict the isotropy of space. 

Let us consider three reference systems $K$, $K_1$, $K_2$, and let $V_1$ and $V_2$ be the velocities of 
systems $K_1$ and $K_2$ relative to $K$. We then have: 

$$ds^2 = a(V_1)\,ds_1^2, \qquad ds^2 = a(V_2)\,ds_2^2.$$

Similarly we can write 

$$ds_1^2 = a(V_{12})\,ds_2^2,$$

where $V_{12}$ is the absolute value of the velocity of $K_2$ relative to $K_1$. Comparing these 
relations with one another, we find that we must have 

$$\frac{a(V_2)}{a(V_1)} = a(V_{12}). \tag{2.5}$$

But $V_{12}$ depends not only on the absolute values of the vectors $\mathbf{V}_1$ and $\mathbf{V}_2$, but also on the 
angle between them. However, this angle does not appear on the left side of formula (2.5). 
It is therefore clear that this formula can be correct only if the function a(V) reduces to a 
constant, which is equal to unity according to this same formula. 
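
The two steps compressed into this argument can be written out explicitly. The following is only a sketch of the algebra, using the symbols already introduced above: the first line shows how (2.5) follows from the three proportionality relations, and the second shows why a constant $a$ must equal unity.

```latex
\begin{gather*}
a(V_2)\,ds_2^2 = ds^2 = a(V_1)\,ds_1^2 = a(V_1)\,a(V_{12})\,ds_2^2
  \quad\Longrightarrow\quad \frac{a(V_2)}{a(V_1)} = a(V_{12}), \\
a(V) = a = \text{const}
  \quad\Longrightarrow\quad \frac{a}{a} = a
  \quad\Longrightarrow\quad a = 1 .
\end{gather*}
```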

Thus, 

$$ds^2 = ds'^2, \tag{2.6}$$

and from the equality of the infinitesimal intervals there follows the equality of finite 
intervals: $s = s'$. 

Thus we arrive at a very important result: the interval between two events is the same in 
all inertial systems of reference, i.e. it is invariant under transformation from one inertial 
system to any other. This invariance is the mathematical expression of the constancy of the 
velocity of light. 
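
The invariance can be checked numerically. The Python sketch below anticipates the transformation formulas for a system K' moving along the X axis, which have not yet been derived at this point; they are taken here as an assumption in their standard form. Units with c = 1 are used by default, and the function names and sample events are illustrative only.

```python
import math


def interval_squared(e1, e2, c=1.0):
    """Square of the interval (2.3) between events given as (t, x, y, z)."""
    dt, dx, dy, dz = (b - a for a, b in zip(e1, e2))
    return c**2 * dt**2 - dx**2 - dy**2 - dz**2


def to_K(event_prime, V, c=1.0):
    """Coordinates in K of an event given in K' (K' moves along X with velocity V).

    ASSUMPTION: the standard transformation formulas, not yet derived at this point.
    """
    t_p, x_p, y_p, z_p = event_prime
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    t = gamma * (t_p + V * x_p / c**2)
    x = gamma * (x_p + V * t_p)
    return (t, x, y_p, z_p)


first_prime = (0.0, 0.0, 0.0, 0.0)
second_prime = (2.0, 1.0, 0.5, -0.3)
V = 0.6

s2_prime = interval_squared(first_prime, second_prime)
s2 = interval_squared(to_K(first_prime, V), to_K(second_prime, V))
print(s2_prime, s2)  # the two values agree up to rounding
```

The coordinates of the second event differ in the two systems, while the square of the interval does not.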

Again let $x_1, y_1, z_1, t_1$ and $x_2, y_2, z_2, t_2$ be the coordinates of two events in a certain reference 
system K. Does there exist a coordinate system K', in which these two events occur at one 
and the same point in space? 

We introduce the notation 


$$t_2 - t_1 = t_{12}, \qquad (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 = l_{12}^2.$$

Then the interval between events in the K system is: 

$$s_{12}^2 = c^2 t_{12}^2 - l_{12}^2$$

and in the K' system 

$$s'^2_{12} = c^2 t'^2_{12} - l'^2_{12},$$

whereupon, because of the invariance of intervals, 

$$c^2 t_{12}^2 - l_{12}^2 = c^2 t'^2_{12} - l'^2_{12}.$$

We want the two events to occur at the same point in the K' system, that is, we require 
$l'_{12} = 0$. Then 

$$s_{12}^2 = c^2 t_{12}^2 - l_{12}^2 = c^2 t'^2_{12} > 0.$$

Consequently a system of reference with the required property exists if $s_{12}^2 > 0$, that is, if the 
interval between the two events is a real number. Real intervals are said to be timelike. 

Thus, if the interval between two events is timelike, then there exists a system of reference 
in which the two events occur at one and the same place. The time which elapses between 
the two events in this system is 

$$t'_{12} = \frac{1}{c}\sqrt{c^2 t_{12}^2 - l_{12}^2} = \frac{s_{12}}{c}. \tag{2.7}$$

If two events occur in one and the same body, then the interval between them is always 
timelike, for the distance which the body moves between the two events cannot be greater 
than $ct_{12}$, since the velocity of the body cannot exceed c. So we have always 

$$l_{12} < c\,t_{12}.$$


Let us now ask whether or not we can find a system of reference in which the two events 
occur at one and the same time. As before, we have for the K and K' systems 
$c^2 t_{12}^2 - l_{12}^2 = c^2 t'^2_{12} - l'^2_{12}$. We want to have $t'_{12} = 0$, so that 

$$s_{12}^2 = -l'^2_{12} < 0.$$

Consequently the required system can be found only for the case when the interval $s_{12}$ 
between the two events is an imaginary number. Imaginary intervals are said to be spacelike. 

Thus if the interval between two events is spacelike, there exists a reference system in 
which the two events occur simultaneously. The distance between the points where the 
events occur in this system is 

$$l'_{12} = \sqrt{l_{12}^2 - c^2 t_{12}^2} = i s_{12}. \tag{2.8}$$

The division of intervals into space- and timelike intervals is, because of their invariance, 
an absolute concept. This means that the timelike or spacelike character of an interval is 
independent of the reference system. 
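
The classification, together with formulas (2.7) and (2.8), can be collected into a few lines of Python. This is a minimal sketch: the function name and the label "zero" for a vanishing interval are choices made here for illustration, and units with c = 1 are used by default.

```python
import math


def classify_interval(t12, l12, c=1.0):
    """Classify the interval between two events.

    t12 is the time difference and l12 the spatial distance in some
    inertial system. Returns (kind, value) where value is
      - for a timelike interval: the time t'_12 = s_12 / c of (2.7),
      - for a spacelike interval: the distance l'_12 of (2.8),
      - for a zero interval: 0.0.
    """
    s2 = (c * t12) ** 2 - l12**2
    if s2 > 0:
        return "timelike", math.sqrt(s2) / c
    if s2 < 0:
        return "spacelike", math.sqrt(-s2)
    return "zero", 0.0


print(classify_interval(5.0, 3.0))  # timelike
print(classify_interval(3.0, 5.0))  # spacelike
```

Since only the invariant combination $c^2 t_{12}^2 - l_{12}^2$ enters, the result does not depend on the system in which $t_{12}$ and $l_{12}$ were measured.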

Let us take some event O as our origin of time and space coordinates. In other words, in 
the four-dimensional system of coordinates, the axes of which are marked x, y, z, t, the world 
point of the event O is the origin of coordinates. Let us now consider what relation other 
events bear to the given event O. For visualization, we shall consider only one space 
dimension and the time, marking them on two axes (Fig. 2). Uniform rectilinear motion of 
a particle, passing through x = 0 at t = 0, is represented by a straight line going through O 
and inclined to the t axis at an angle whose tangent is the velocity of the particle. Since the 
maximum possible velocity is c, there is a maximum angle which this line can subtend with 
the t axis. In Fig. 2 are shown the two lines representing the propagation of two signals (with 
the velocity of light) in opposite directions passing through the event O (i.e. going through 
$x = 0$ at $t = 0$). All lines representing the motion of particles can lie only in the regions aOc 
and dOb. On the lines ab and cd, $x = \pm ct$. First consider events whose world points lie 
within the region aOc. It is easy to show that for all the points of this region $c^2 t^2 - x^2 > 0$. 
In other words, the interval between any event in this region and the event O is timelike. In 
this region t > 0, i.e. all the events in this region occur “after” the event O. But two events 
which are separated by a timelike interval cannot occur simultaneously in any reference 
system. Consequently it is impossible to find a reference system in which any of the events 
in region aOc occurred “before” the event O, i.e. at time $t < 0$. Thus all the events in region 
aOc are future events relative to O in all reference systems. Therefore this region can be 
called the absolute future relative to O. 



Fig. 2 

In exactly the same way, all events in the region bOd are in the absolute past relative to 
O; i.e. events in this region occur before the event O in all systems of reference. 

Next consider regions dOa and cOb. The interval between any event in these regions and the 
event O is spacelike. These events occur at different points in space in every reference 
system. Therefore these regions can be said to be absolutely remote relative to O. However, 
the concepts “simultaneous”, “earlier”, and “later” are relative for these regions. For any 
event in these regions there exist systems of reference in which it occurs after the event O, 
systems in which it occurs earlier than O, and finally one reference system in which it occurs 
simultaneously with O. 

Note that if we consider all three space coordinates instead of just one, then instead of the 
two intersecting lines of Fig. 2 we would have a “cone” $x^2 + y^2 + z^2 - c^2 t^2 = 0$ in the four-dimensional 
coordinate system x, y, z, t, the axis of the cone coinciding with the t axis. (This 
cone is called the light cone.) The regions of absolute future and absolute past are then 
represented by the two interior portions of this cone. 

Two events can be related causally to each other only if the interval between them is 
timelike; this follows immediately from the fact that no interaction can propagate with a 
velocity greater than the velocity of light. As we have just seen, it is precisely for these 
events that the concepts “earlier” and “later” have an absolute significance, which is a 
necessary condition for the concepts of cause and effect to have meaning. 
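
The difference between the absolute and the relative ordering of events can be illustrated by a short Python sketch restricted, as in Fig. 2, to one space dimension. The formula for the time of an event in a moving system is the standard one obtained only later, in § 4, and is used here as an assumption; the function names and the two sample events are hypothetical, and c = 1 by default.

```python
import math


def region_relative_to_O(t, x, c=1.0):
    """Region of Fig. 2 in which the event (t, x) lies, relative to O = (0, 0)."""
    s2 = (c * t) ** 2 - x**2
    if s2 > 0:
        return "absolute future" if t > 0 else "absolute past"
    if s2 < 0:
        return "absolutely remote"
    return "on the light cone"


def time_in_moving_system(t, x, V, c=1.0):
    """Time of the event in a system moving along X with velocity V.

    ASSUMPTION: standard transformation formula, derived only later (§ 4).
    """
    return (t - V * x / c**2) / math.sqrt(1.0 - (V / c) ** 2)


remote = (1.0, 2.0)   # spacelike relative to O
future = (2.0, 1.0)   # timelike relative to O, t > 0

print(region_relative_to_O(*remote), region_relative_to_O(*future))
for V in (0.0, 0.5, 0.8):
    # The sign of the first number changes with V; the second stays positive.
    print(V, time_in_moving_system(*remote, V), time_in_moving_system(*future, V))
```

For the absolutely remote event the time is positive, zero, or negative depending on V, whereas the event in the absolute future keeps a positive time for every admissible V.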

§ 3. Proper time 

Suppose that in a certain inertial reference system we observe clocks which are moving 
relative to us in an arbitrary manner. At each different moment of time this motion can be 
considered as uniform. Thus at each moment of time we can introduce a coordinate system 
rigidly linked to the moving clocks, which with the clocks constitutes an inertial reference 
system. 

In the course of an infinitesimal time interval $dt$ (as read by a clock in our rest frame) the 
moving clocks go a distance $\sqrt{dx^2 + dy^2 + dz^2}$. Let us ask what time interval $dt'$ is 
indicated for this period by the moving clocks. In a system of coordinates linked to the 
moving clocks, the latter are at rest, i.e., $dx' = dy' = dz' = 0$. Because of the invariance of 
intervals 

$$ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 = c^2 dt'^2,$$

from which 

$$dt' = dt \sqrt{1 - \frac{dx^2 + dy^2 + dz^2}{c^2\,dt^2}}.$$

But 

$$\frac{dx^2 + dy^2 + dz^2}{dt^2} = v^2,$$

where $v$ is the velocity of the moving clocks; therefore 

$$dt' = \frac{ds}{c} = dt \sqrt{1 - \frac{v^2}{c^2}}. \tag{3.1}$$


Integrating this expression, we can obtain the time interval indicated by the moving clocks 
when the elapsed time according to a clock at rest is $t_2 - t_1$: 

$$t'_2 - t'_1 = \int_{t_1}^{t_2} dt \sqrt{1 - \frac{v^2}{c^2}}. \tag{3.2}$$

The time read by a clock moving with a given object is called the proper time for this 
object. Formulas (3.1) and (3.2) express the proper time in terms of the time for a system of 
reference from which the motion is observed. 

As we see from (3.1) or (3.2), the proper time of a moving object is always less than the 
corresponding interval in the rest system. In other words, moving clocks go more slowly 
than those at rest. 

Suppose some clocks are moving in uniform rectilinear motion relative to an inertial 
system K. A reference frame K' linked to these clocks is also inertial. Then from the point of 
view of an observer in the K system the clocks in the K' system fall behind. And conversely, 
from the point of view of the K' system, the clocks in K lag. To convince ourselves that there 
is no contradiction, let us note the following. In order to establish that the clocks in the K' 
system lag behind those in the K system, we must proceed in the following fashion. Suppose 
that at a certain moment the clock in K' passes by the clock in K, and at that moment the 
readings of the two clocks coincide. To compare the rates of the two clocks in K and K' we 
must once more compare the readings of the same moving clock in K' with the clocks in K. 
But now we compare this clock with different clocks in K, namely with those past which the clock 
in K' goes at this new time. Then we find that the clock in K' lags behind the clocks in K with 
which it is being compared. We see that to compare the rates of clocks in two reference 
frames we require several clocks in one frame and one in the other, and that therefore this 
process is not symmetric with respect to the two systems. The clock that appears to lag is 
always the one which is being compared with different clocks in the other system. 

If we have two clocks, one of which describes a closed path returning to the starting point 
(the position of the clock which remained at rest), then clearly the moving clock appears to 
lag relative to the one at rest. The converse reasoning, in which the moving clock would be 
considered to be at rest (and vice versa) is now impossible, since the clock describing a 
closed trajectory does not carry out a uniform rectilinear motion, so that a coordinate system 
linked to it will not be inertial. 

Since the laws of nature are the same only for inertial reference frames, the frames linked 
to the clock at rest (inertial frame) and to the moving clock (non-inertial) have different 
properties, and the argument which leads to the result that the clock at rest must lag is not valid. 
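
The lag of a clock carried round a closed path can be estimated by evaluating the integral (3.2) numerically. The Python sketch below is only an illustration under stated assumptions: the midpoint rule, the function name, the two velocity profiles, and the use of units with c = 1 are all choices made here. Only the magnitude of the velocity enters (3.2), so the shape of the closed path need not be specified.

```python
import math


def proper_time(speed, t1, t2, c=1.0, n=10_000):
    """Approximate the integral (3.2) by the midpoint rule.

    speed(t) must return the magnitude of the clock's velocity at time t
    (as measured in the inertial system of the clock at rest), with speed < c.
    """
    h = (t2 - t1) / n
    total = 0.0
    for k in range(n):
        v = speed(t1 + (k + 0.5) * h)
        total += math.sqrt(1.0 - (v / c) ** 2) * h
    return total


T = 10.0  # time of the round trip according to the clock at rest

# Clock carried round a closed circle at the constant speed 0.6 c.
tau_circle = proper_time(lambda t: 0.6, 0.0, T)

# Clock that speeds up and slows down again (an arbitrary illustrative profile).
tau_varying = proper_time(lambda t: 0.9 * math.sin(math.pi * t / T), 0.0, T)

print(tau_circle, tau_varying, T)  # both proper times are less than T
```

For the constant speed 0.6c the integrand is constant and equal to 0.8, so the moving clock reads 0.8 T on its return.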

The time interval read by a clock is equal to the integral 

$$\frac{1}{c}\int_a^b ds$$

taken along the world line of the clock. If the clock is at rest then its world line is clearly a 
line parallel to the t axis; if the clock carries out a nonuniform motion in a closed path and 
returns to its starting point, then its world line will be a curve passing through the two points, 
on the straight world line of a clock at rest, corresponding to the beginning and end of the 
motion. On the other hand, we saw that the clock at rest always indicates a greater time 
interval than the moving one. Thus we arrive at the result that the integral 

$$\int_a^b ds,$$

taken between a given pair of world points, has its maximum value if it is taken along the 
straight world line joining these two points.† 

† It is assumed, of course, that the points a and b and the curves joining them are such that all elements 
ds along the curves are timelike. 

This property of the integral is connected with the pseudo-euclidean character of the four-dimensional 
geometry. In euclidean space the integral would, of course, be a minimum along the straight line. 

§ 4. The Lorentz transformation 

Our purpose is now to obtain the formula of transformation from one inertial reference 
system to another, that is, a formula by means of which, knowing the coordinates x, y, z, t 
of a certain event in the K system, we can find the coordinates x', y', z', t' of the same event 
in another inertial system K'. 

In classical mechanics this question is resolved very simply. Because of the absolute 
nature of time we there have t = t'; if, furthermore, the coordinate axes are chosen as usual 
(axes X, X' coincident, Y, Z axes parallel to Y', Z', motion along X, X') then the coordinates 
y, z clearly are equal to y', z', while the coordinates x and x' differ by the distance traversed 
by one system relative to the other. If the time origin is chosen as the moment when the two 
coordinate systems coincide, and if the velocity of the K' system relative to K is V, then this 
distance is Vt. Thus 

$$x = x' + Vt, \qquad y = y', \qquad z = z', \qquad t = t'. \tag{4.1}$$

This formula is called the Galileo transformation. It is easy to verify that this transformation, 
as was to be expected, does not satisfy the requirements of the theory of relativity; it does 
not leave the interval between events invariant. 

We shall obtain the relativistic transformation precisely as a consequence of the requirement 
that it leaves the interval between events invariant. 

As we saw in § 2, the interval between events can be looked on as the distance between 
the corresponding pair of world points in a four-dimensional system of coordinates. 
Consequently we may say that the required transformation must leave unchanged all distances
in the four-dimensional x, y, z, ct space. But such transformations consist only of parallel
displacements and rotations of the coordinate system. Of these the displacement of the
coordinate system parallel to itself is of no interest, since it leads only to a shift in the origin
of the space coordinates and a change in the time reference point. Thus the required
transformation must be expressible mathematically as a rotation of the four-dimensional
x, y, z, ct coordinate system.

Every rotation in the four-dimensional space can be resolved into six rotations, in the 
planes xy, zy, xz, tx, ty, tz (just as every rotation in ordinary space can be resolved into three 
rotations in the planes xy, zy and xz). The first three of these rotations transform only the 
space coordinates; they correspond to the usual space rotations. 

Let us consider a rotation in the tx plane; under this, the y and z coordinates do not change.
In particular, this transformation must leave unchanged the difference $(ct)^2 - x^2$, the square
of the “distance” of the point (ct, x) from the origin. The relation between the old and the
new coordinates is given in most general form by the formulas:

$$x = x' \cosh\psi + ct' \sinh\psi, \quad ct = x' \sinh\psi + ct' \cosh\psi, \tag{4.2}$$

where $\psi$ is the “angle of rotation”; a simple check shows that in fact $c^2t^2 - x^2 = c^2t'^2 - x'^2$.
Formula (4.2) differs from the usual formulas for transformation under rotation of the
coordinate axes in having hyperbolic functions in place of trigonometric functions. This is
the difference between pseudo-euclidean and euclidean geometry.
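The "simple check" mentioned above can also be done numerically. The following sketch applies formula (4.2) to an arbitrary point and compares the two quadratic forms; the function name `hyperbolic_rotation` and the sample values are illustrative assumptions.

```python
import math


def hyperbolic_rotation(x_prime, ct_prime, psi):
    """Formula (4.2): return (x, ct) from (x', ct') and the angle psi."""
    x = x_prime * math.cosh(psi) + ct_prime * math.sinh(psi)
    ct = x_prime * math.sinh(psi) + ct_prime * math.cosh(psi)
    return x, ct


# Arbitrary sample values chosen for this sketch.
x_p, ct_p, psi = 1.5, 4.0, 0.7
x, ct = hyperbolic_rotation(x_p, ct_p, psi)

interval_primed = ct_p ** 2 - x_p ** 2
interval_unprimed = ct ** 2 - x ** 2
print(interval_primed, interval_unprimed)
```

The agreement rests only on the identity cosh² ψ − sinh² ψ = 1, which plays the role that cos² + sin² = 1 plays for an ordinary rotation.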

We try to find the formula of transformation from an inertial reference frame K to a system
K' moving relative to K with velocity V along the x axis. In this case clearly only the
coordinate x and the time t are subject to change. Therefore this transformation must have
the form (4.2). Now it remains only to determine the angle $\psi$, which can depend only on the
relative velocity V.†

† Note that to avoid confusion we shall always use V to signify the constant relative velocity of two
inertial systems, and v for the velocity of a moving particle, not necessarily constant.

Let us consider the motion, in the K system, of the origin of the K' system. Then x' = 0 and
formulas (4.2) take the form:

$$x = ct' \sinh\psi, \quad ct = ct' \cosh\psi,$$

or dividing one by the other,

$$\frac{x}{ct} = \tanh\psi.$$

But x/t is clearly the velocity V of the K' system relative to K. So

$$\tanh\psi = \frac{V}{c}.$$

From this

$$\sinh\psi = \frac{V/c}{\sqrt{1 - V^2/c^2}}, \quad \cosh\psi = \frac{1}{\sqrt{1 - V^2/c^2}}.$$


Substituting in (4.2), we find: 


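$$x = \frac{x' + Vt'}{\sqrt{1 - V^2/c^2}}, \quad y = y', \quad z = z', \quad t = \frac{t' + \frac{V}{c^2}x'}{\sqrt{1 - V^2/c^2}}. \tag{4.3}$$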


This is the required transformation formula. It is called the Lorentz transformation, and is 
of fundamental importance for what follows. 

The inverse formulas, expressing x', y', z', t' in terms of x, y, z, t, are most easily obtained
by changing V to -V (since the K system moves with velocity -V relative to the K' system).
The same formulas can be obtained directly by solving equations (4.3) for x', y', z', t'.
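A minimal sketch of formula (4.3) and of its inverse obtained by the substitution V → -V is given below. The function names `to_K` and `to_K_prime` and the sample event are illustrative assumptions; the check confirms that a round trip restores the original coordinates and that the interval $c^2t^2 - x^2$ is unchanged.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(V):
    """The factor 1 / sqrt(1 - V^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (V / C) ** 2)


def to_K(x_p, y_p, z_p, t_p, V):
    """Formula (4.3): coordinates in K from coordinates in K'."""
    g = gamma(V)
    return (g * (x_p + V * t_p), y_p, z_p, g * (t_p + V * x_p / C ** 2))


def to_K_prime(x, y, z, t, V):
    """Inverse transformation: the same formulas with V replaced by -V."""
    return to_K(x, y, z, t, -V)


# Sample event in K' (metres and seconds), chosen arbitrarily for this sketch.
event_prime = (1.0e3, 2.0, 3.0, 1.0e-5)
V = 0.8 * C

event = to_K(*event_prime, V)
round_trip = to_K_prime(*event, V)

interval_prime = (C * event_prime[3]) ** 2 - event_prime[0] ** 2
interval = (C * event[3]) ** 2 - event[0] ** 2
print(event, round_trip)
```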

It is easy to see from (4.3) that on making the transition to the limit $c \to \infty$ and classical
mechanics, the formula for the Lorentz transformation actually goes over into the Galileo
transformation.

For V > c in formula (4.3) the coordinates x, t are imaginary; this corresponds to the fact
that motion with a velocity greater than the velocity of light is impossible. Moreover, one
cannot use a reference system moving with the velocity of light; in that case the denominators
in (4.3) would go to zero.

For velocities V small compared with the velocity of light, we can use in place of (4.3) the 
approximate formulas: 


$$x = x' + Vt', \quad y = y', \quad z = z', \quad t = t' + \frac{V}{c^2}x'. \tag{4.4}$$


Suppose there is a rod at rest in the K system, parallel to the X axis. Let its length,
measured in this system, be $\Delta x = x_2 - x_1$ ($x_2$ and $x_1$ are the coordinates of the two ends of the
rod in the K system). We now determine the length of this rod as measured in the K' system.
To do this we must find the coordinates of the two ends of the rod ($x'_2$ and $x'_1$) in this system
at one and the same time t'. From (4.3) we find:

$$x_1 = \frac{x'_1 + Vt'}{\sqrt{1 - V^2/c^2}}, \quad x_2 = \frac{x'_2 + Vt'}{\sqrt{1 - V^2/c^2}}.$$

The length of the rod in the K' system is $\Delta x' = x'_2 - x'_1$; subtracting $x_1$ from $x_2$, we find

$$\Delta x = \frac{\Delta x'}{\sqrt{1 - V^2/c^2}}.$$


The proper length of a rod is its length in a reference system in which it is at rest. Let
us denote it by $l_0 = \Delta x$, and the length of the rod in any other reference frame K' by $l$.
Then

$$l = l_0 \sqrt{1 - \frac{V^2}{c^2}}. \tag{4.5}$$


Thus a rod has its greatest length in the reference system in which it is at rest. Its length
in a system in which it moves with velocity V is decreased by the factor $\sqrt{1 - V^2/c^2}$. This
result of the theory of relativity is called the Lorentz contraction.

Since the transverse dimensions do not change because of its motion, the volume $\mathcal{V}$ of a
body decreases according to the similar formula

$$\mathcal{V} = \mathcal{V}_0 \sqrt{1 - \frac{V^2}{c^2}}, \tag{4.6}$$

where $\mathcal{V}_0$ is the proper volume of the body.

From the Lorentz transformation we can obtain anew the results already known to us
concerning the proper time (§ 3). Suppose a clock to be at rest in the K' system. We take two
events occurring at one and the same point x', y', z' in space in the K' system. The time
between these events in the K' system is $\Delta t' = t'_2 - t'_1$. Now we find the time $\Delta t$ which
elapses between these two events in the K system. From (4.3), we have

$$t_1 = \frac{t'_1 + \frac{V}{c^2}x'}{\sqrt{1 - V^2/c^2}}, \quad t_2 = \frac{t'_2 + \frac{V}{c^2}x'}{\sqrt{1 - V^2/c^2}},$$

or, subtracting one from the other,

$$t_2 - t_1 = \Delta t = \frac{\Delta t'}{\sqrt{1 - V^2/c^2}},$$

in complete agreement with (3.1).

Finally we mention another general property of Lorentz transformations which distinguishes 
them from Galilean transformations. The latter have the general property of commutativity, 
i.e. the combined result of two successive Galilean transformations (with different velocities 
$\mathbf{V}_1$ and $\mathbf{V}_2$) does not depend on the order in which the transformations are performed. On the
other hand, the result of two successive Lorentz transformations does depend, in general, on 
their order. This is already apparent purely mathematically from our formal description of 
these transformations as rotations of the four-dimensional coordinate system: we know that 
the result of two rotations (about different axes) depends on the order in which they are 
carried out. The sole exception is the case of transformations with parallel vectors $\mathbf{V}_1$ and $\mathbf{V}_2$
(which are equivalent to two rotations of the four-dimensional coordinate system about the 
same axis). 
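The dependence on order can be seen directly by multiplying the matrices of two Lorentz transformations. The sketch below, in units with c = 1, builds the 4×4 matrices acting on (ct, x, y, z) for velocities along the x and y axes; the helper names (`boost_x`, `boost_y`, `matmul`, `max_difference`) and the chosen speeds are illustrative assumptions.

```python
import math


def boost_x(beta):
    """Matrix of the Lorentz transformation (4.3) for velocity beta*c along x."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, g * beta, 0.0, 0.0],
            [g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def boost_y(beta):
    """Same transformation with the velocity directed along y."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, 0.0, g * beta, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [g * beta, 0.0, g, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][n] * b[n][k] for n in range(4)) for k in range(4)]
            for i in range(4)]


def max_difference(a, b):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(a[i][k] - b[i][k]) for i in range(4) for k in range(4))


# Perpendicular velocities: the result depends on the order.
xy = matmul(boost_x(0.6), boost_y(0.6))
yx = matmul(boost_y(0.6), boost_x(0.6))

# Parallel velocities: the two orders agree.
x12 = matmul(boost_x(0.3), boost_x(0.5))
x21 = matmul(boost_x(0.5), boost_x(0.3))

print(max_difference(xy, yx), max_difference(x12, x21))
```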


§ 5. Transformation of velocities 

In the preceding section we obtained formulas which enable us to find from the coordinates
of an event in one reference frame, the coordinates of the same event in a second reference
frame. Now we find formulas relating the velocity of a material particle in one reference
system to its velocity in a second reference system.

Let us suppose once again that the K' system moves relative to the K system with velocity
V along the x axis. Let $v_x = dx/dt$ be the component of the particle velocity in the K system
and $v'_x = dx'/dt'$ the velocity component of the same particle in the K' system. From (4.3),
we have

$$dx = \frac{dx' + V\,dt'}{\sqrt{1 - V^2/c^2}}, \quad dy = dy', \quad dz = dz', \quad dt = \frac{dt' + \frac{V}{c^2}dx'}{\sqrt{1 - V^2/c^2}}.$$
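Dividing the first three equations by the fourth and introducing the velocities

$$\mathbf{v} = \frac{d\mathbf{r}}{dt}, \quad \mathbf{v}' = \frac{d\mathbf{r}'}{dt'},$$

we find

$$v_x = \frac{v'_x + V}{1 + v'_x V/c^2}, \quad v_y = \frac{v'_y\sqrt{1 - V^2/c^2}}{1 + v'_x V/c^2}, \quad v_z = \frac{v'_z\sqrt{1 - V^2/c^2}}{1 + v'_x V/c^2}. \tag{5.1}$$

These formulas determine the transformation of velocities. They describe the law of composition
of velocities in the theory of relativity. In the limiting case of $c \to \infty$, they go over into the
formulas $v_x = v'_x + V$, $v_y = v'_y$, $v_z = v'_z$ of classical mechanics.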

In the special case of motion of a particle parallel to the X axis, $v_x = v$, $v_y = v_z = 0$.
Then $v'_y = v'_z = 0$, $v'_x = v'$, so that

$$v = \frac{v' + V}{1 + v'V/c^2}. \tag{5.2}$$


It is easy to convince oneself that the sum of two velocities each smaller than the velocity 
of light is again not greater than the light velocity. 
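A short numerical sketch of formula (5.2) makes this concrete. Units with c = 1 are assumed by default, and the function name `add_velocities` is illustrative rather than taken from the text.

```python
def add_velocities(v_prime, V, c=1.0):
    """Formula (5.2): velocity in K of a particle moving with v' in K',
    both velocities directed along the X axis."""
    return (v_prime + V) / (1.0 + v_prime * V / c ** 2)


# Two velocities of 0.9c combine to less than c, not to 1.8c.
combined = add_velocities(0.9, 0.9)

# A signal moving with velocity c in K' also moves with velocity c in K.
light = add_velocities(1.0, 0.5)

print(combined, light)
```

For small velocities the denominator is close to 1 and the classical sum v' + V is recovered.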

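For a velocity V significantly smaller than the velocity of light (the velocity v can be
arbitrary), we have approximately, to terms of order V/c:

$$v_x = v'_x + V\left(1 - \frac{v'^2_x}{c^2}\right), \quad v_y = v'_y - v'_x v'_y \frac{V}{c^2}, \quad v_z = v'_z - v'_x v'_z \frac{V}{c^2}.$$

These three formulas can be written as a single vector formula

$$\mathbf{v} = \mathbf{v}' + \mathbf{V} - \frac{1}{c^2}(\mathbf{V} \cdot \mathbf{v}')\mathbf{v}'. \tag{5.3}$$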

We may point out that in the relativistic law of addition of velocities (5.1) the two velocities
$\mathbf{v}'$ and $\mathbf{V}$ which are combined enter unsymmetrically (provided they are not both directed
along the x axis). This fact is related to the noncommutativity of Lorentz transformations
which we mentioned in the preceding section.

Let us choose our coordinate axes so that the velocity of the particle at the given moment
lies in the XY plane. Then the velocity of the particle in the K system has components $v_x = v \cos\theta$,
$v_y = v \sin\theta$, and in the K' system $v'_x = v' \cos\theta'$, $v'_y = v' \sin\theta'$ ($v$, $v'$, $\theta$, $\theta'$ are the
absolute values and the angles subtended with the X, X' axes respectively in the K, K'
systems). With the help of formula (5.1), we then find

$$\tan\theta = \frac{v'\sqrt{1 - V^2/c^2}\,\sin\theta'}{v' \cos\theta' + V}. \tag{5.4}$$


This formula describes the change in the direction of the velocity on transforming from 
one reference system to another. 

Let us consider a very important special case of this formula, namely, the deviation of 
light in transforming to a new reference system—a phenomenon known as the aberration of 
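light. In this case $v = v' = c$, so that the preceding formula goes over into

$$\tan\theta = \frac{\sqrt{1 - V^2/c^2}}{\frac{V}{c} + \cos\theta'}\,\sin\theta'. \tag{5.5}$$

From the same transformation formulas (5.1) it is easy to obtain for $\sin\theta$ and $\cos\theta$:

$$\sin\theta = \frac{\sqrt{1 - V^2/c^2}}{1 + \frac{V}{c}\cos\theta'}\,\sin\theta', \quad \cos\theta = \frac{\cos\theta' + \frac{V}{c}}{1 + \frac{V}{c}\cos\theta'}. \tag{5.6}$$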


In case $V \ll c$, we find from this formula, correct to terms of order V/c:

$$\sin\theta - \sin\theta' = -\frac{V}{c}\sin\theta' \cos\theta'.$$

Introducing the angle $\Delta\theta = \theta' - \theta$ (the aberration angle), we find to the same order of
accuracy

$$\Delta\theta = \frac{V}{c}\sin\theta', \tag{5.7}$$

which is the well-known elementary formula for the aberration of light. 
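To see how good the first-order formula (5.7) is, one can compare it with the exact expressions (5.6). The sketch below does this for V/c = 10⁻⁴, which is roughly the Earth's orbital speed in units of c; the function names and the sample angle are illustrative assumptions.

```python
import math


def theta_exact(theta_prime, beta):
    """Angle theta in K from the exact formulas (5.6), with beta = V/c."""
    denom = 1.0 + beta * math.cos(theta_prime)
    sin_theta = math.sqrt(1.0 - beta ** 2) * math.sin(theta_prime) / denom
    cos_theta = (math.cos(theta_prime) + beta) / denom
    return math.atan2(sin_theta, cos_theta)


def aberration_first_order(theta_prime, beta):
    """Aberration angle from the approximate formula (5.7)."""
    return beta * math.sin(theta_prime)


beta = 1.0e-4                    # V/c, small compared with unity
theta_prime = math.radians(60.0)  # sample direction in K'

delta_exact = theta_prime - theta_exact(theta_prime, beta)
delta_approx = aberration_first_order(theta_prime, beta)
print(delta_exact, delta_approx)
```

The two values differ only in terms of order (V/c)², as expected for an expansion kept to first order.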


§ 6. Four-vectors 

The coordinates of an event (ct, x, y, z) can be considered as the components of a
four-dimensional radius vector (or, for short, a four-radius vector) in a four-dimensional space.
We shall denote its components by $x^i$, where the index i takes on the values 0, 1, 2, 3, and

$$x^0 = ct, \quad x^1 = x, \quad x^2 = y, \quad x^3 = z.$$

The square of the “length” of the radius four-vector is given by

$$(x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2.$$

It does not change under any rotations of the four-dimensional coordinate system, in particular
under Lorentz transformations.

In general a set of four quantities $A^0$, $A^1$, $A^2$, $A^3$ which transform like the components of
the radius four-vector $x^i$ under transformations of the four-dimensional coordinate system is
called a four-dimensional vector (four-vector) $A^i$. Under Lorentz transformations,

$$A^0 = \frac{A'^0 + \frac{V}{c}A'^1}{\sqrt{1 - V^2/c^2}}, \quad A^1 = \frac{A'^1 + \frac{V}{c}A'^0}{\sqrt{1 - V^2/c^2}}, \quad A^2 = A'^2, \quad A^3 = A'^3. \tag{6.1}$$

The square magnitude of any four-vector is defined analogously to the square of the radius
four-vector:

$$(A^0)^2 - (A^1)^2 - (A^2)^2 - (A^3)^2.$$

For convenience of notation, we introduce two “types” of components of four-vectors,
denoting them by the symbols $A^i$ and $A_i$, with superscripts and subscripts. These are related
by

$$A_0 = A^0, \quad A_1 = -A^1, \quad A_2 = -A^2, \quad A_3 = -A^3. \tag{6.2}$$

The quantities $A^i$ are called the contravariant, and the $A_i$ the covariant components of the
four-vector. The square of the four-vector then appears in the form

$$\sum_{i=0}^{3} A^i A_i = A^0 A_0 + A^1 A_1 + A^2 A_2 + A^3 A_3.$$

Such sums are customarily written simply as $A^i A_i$, omitting the summation sign. One
agrees that one sums over any repeated index, and omits the summation sign. Of the pair of
indices, one must be a superscript and the other a subscript. This convention for summation
over “dummy” indices is very convenient and considerably simplifies the writing of formulas.

We shall use Latin letters i, k, l, ..., for four-dimensional indices, taking on the values 0,
1, 2, 3.

In analogy to the square of a four-vector, one forms the scalar product of two different
four-vectors:

$$A^i B_i = A^0 B_0 + A^1 B_1 + A^2 B_2 + A^3 B_3.$$

It is clear that this can be written either as $A^i B_i$ or $A_i B^i$; the result is the same. In general one
can switch upper and lower indices in any pair of dummy indices.†

The product $A^i B_i$ is a four-scalar: it is invariant under rotations of the four-dimensional
coordinate system. This is easily verified directly,‡ but it is also apparent beforehand (from
the analogy with the square $A^i A_i$) from the fact that all four-vectors transform according to
the same rule.
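The direct verification can be sketched numerically: transform two four-vectors according to (6.1) and compare the scalar product before and after. In the sketch below `beta` stands for V/c, and the function names and sample components are illustrative assumptions.

```python
import math

# Signs relating contravariant and covariant components, as in (6.2).
METRIC_SIGNS = (1.0, -1.0, -1.0, -1.0)


def lower(a):
    """Covariant components A_i from contravariant components A^i."""
    return tuple(sign * comp for sign, comp in zip(METRIC_SIGNS, a))


def scalar_product(a, b):
    """The sum A^i B_i over i = 0, 1, 2, 3."""
    return sum(x * y for x, y in zip(a, lower(b)))


def lorentz(a_prime, beta):
    """Formula (6.1): components in K from components in K', beta = V/c."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    a0, a1, a2, a3 = a_prime
    return (g * (a0 + beta * a1), g * (a1 + beta * a0), a2, a3)


# Arbitrary sample four-vectors (contravariant components in K').
A_prime = (3.0, 1.0, -2.0, 0.5)
B_prime = (1.0, 4.0, 0.0, 2.0)
beta = 0.8

before = scalar_product(A_prime, B_prime)
after = scalar_product(lorentz(A_prime, beta), lorentz(B_prime, beta))
print(before, after)
```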

† In the literature the indices are often omitted on four-vectors, and their squares and scalar products are
written as $A^2$, $AB$. We shall not use this notation in the present text.

‡ One should remember that the law for transformation of a four-vector expressed in covariant components
differs (in signs) from the same law expressed for contravariant components. Thus, instead of (6.1), one will
have:

$$A_0 = \frac{A'_0 - \frac{V}{c}A'_1}{\sqrt{1 - V^2/c^2}}, \quad A_1 = \frac{A'_1 - \frac{V}{c}A'_0}{\sqrt{1 - V^2/c^2}}, \quad A_2 = A'_2, \quad A_3 = A'_3.$$





The component $A^0$ is called the time component, and $A^1$, $A^2$, $A^3$ the space components of
the four-vector (in analogy to the radius four-vector). The square of a four-vector can be
positive, negative, or zero; such vectors are called timelike, spacelike, and null vectors,
respectively (again in analogy to the terminology for intervals).†

Under purely spatial rotations (i.e. transformations not affecting the time axis) the three
space components of the four-vector $A^i$ form a three-dimensional vector $\mathbf{A}$. The time component
of the four-vector is a three-dimensional scalar (with respect to these transformations). In
enumerating the components of a four-vector, we shall often write them as

$$A^i = (A^0, \mathbf{A}).$$

The covariant components of the same four-vector are $A_i = (A^0, -\mathbf{A})$, and the square of the
four-vector is $A^i A_i = (A^0)^2 - \mathbf{A}^2$. Thus, for the radius four-vector:

$$x^i = (ct, \mathbf{r}), \quad x_i = (ct, -\mathbf{r}), \quad x^i x_i = c^2t^2 - \mathbf{r}^2.$$

For three-dimensional vectors (with coordinates x, y, z) there is no need to distinguish
between contra- and covariant components. Whenever this can be done without causing
confusion, we shall write their components as $A_\alpha$ ($\alpha = x, y, z$) using Greek letters for subscripts.
In particular we shall assume a summation over x, y, z for any repeated index (for example
$\mathbf{A} \cdot \mathbf{B} = A_\alpha B_\alpha$).


A four-dimensional tensor (four-tensor) of the second rank is a set of sixteen quantities
$A^{ik}$, which under coordinate transformations transform like the products of components of
two four-vectors. We similarly define four-tensors of higher rank.

The components of a second-rank tensor can be written in three forms: covariant, $A_{ik}$,
contravariant, $A^{ik}$, and mixed, $A^i{}_k$ (where, in the last case, one should distinguish between
$A^i{}_k$ and $A_i{}^k$, i.e. one should be careful about which of the two is superscript and which a
subscript). The connection between the different types of components is determined from
the general rule: raising or lowering a space index (1, 2, 3) changes the sign of the component,
while raising or lowering the time index (0) does not. Thus:

$$A_{00} = A^{00}, \quad A_{01} = -A^{01}, \quad A_{11} = A^{11}, \ldots,$$

$$A_0{}^0 = A^{00}, \quad A_0{}^1 = A^{01}, \quad A^0{}_1 = -A^{01}, \quad A_1{}^1 = -A^{11}, \ldots.$$

Under purely spatial transformations, the nine quantities $A^{11}$, $A^{12}$, ... form a three-tensor.
The three components $A^{01}$, $A^{02}$, $A^{03}$ and the three components $A^{10}$, $A^{20}$, $A^{30}$ constitute
three-dimensional vectors, while the component $A^{00}$ is a three-dimensional scalar.

A tensor $A^{ik}$ is said to be symmetric if $A^{ik} = A^{ki}$, and antisymmetric if $A^{ik} = -A^{ki}$. In an
antisymmetric tensor, all the diagonal components (i.e. the components $A^{00}$, $A^{11}$, ...) are
zero, since, for example, we must have $A^{00} = -A^{00}$. For a symmetric tensor $A^{ik}$ the mixed
components $A^i{}_k$ and $A_k{}^i$ obviously coincide; in such cases we shall simply write $A^i_k$, putting
the indices one above the other.

In every tensor equation, the two sides must contain identical and identically placed (i.e. 
above or below) free indices (as distinguished from dummy indices). The free indices in 
tensor equations can be shifted up or down, but this must be done simultaneously in all terms 
in the equation. Equating covariant and contravariant components of different tensors is 
“illegal”; such an equation, even if it happened by chance to be valid in a particular reference 
system, would be violated on going to another frame. 


† Null vectors are also said to be isotropic.

From the tensor components $A^{ik}$ one can form a scalar by taking the sum

$$A^i{}_i = A^0{}_0 + A^1{}_1 + A^2{}_2 + A^3{}_3$$

(where, of course, $A_i{}^i = A^i{}_i$). This sum is called the trace of the tensor, and the operation for
obtaining it is called contraction.

The formation of the scalar product of two vectors, considered earlier, is a contraction
operation: it is the formation of the scalar $A^i B_i$ from the tensor $A^i B_k$. In general, contracting
on any pair of indices reduces the rank of the tensor by 2. For example, $A^i{}_{kli}$ is a tensor of
second rank, $A^i{}_k B^k$ is a four-vector, $A^{ik}{}_{ik}$ is a scalar, etc.

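The unit four-tensor $\delta^i_k$ satisfies the condition that for any four-vector $A^i$,

$$\delta^k_i A^i = A^k. \tag{6.3}$$

It is clear that the components of this tensor are

$$\delta^k_i = \begin{cases} 1, & \text{if } i = k, \\ 0, & \text{if } i \neq k. \end{cases} \tag{6.4}$$

Its trace is $\delta^i_i = 4$.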

By raising the one index or lowering the other in $\delta^k_i$, we can obtain the contra- or
covariant tensor $g^{ik}$ or $g_{ik}$, which is called the metric tensor. The tensors $g^{ik}$ and $g_{ik}$ have
identical components, which can be written as a matrix:

$$(g^{ik}) = (g_{ik}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{6.5}$$

(the index i labels the rows, and k the columns, in the order 0, 1, 2, 3). It is clear that

$$g_{ik} A^k = A_i, \quad g^{ik} A_k = A^i. \tag{6.6}$$

The scalar product of two four-vectors can therefore be written in the form:

$$A^i A_i = g_{ik} A^i A^k = g^{ik} A_i A_k. \tag{6.7}$$

The tensors $\delta^i_k$, $g_{ik}$, $g^{ik}$ are special in the sense that their components are the same in all
coordinate systems. The completely antisymmetric unit tensor of fourth rank, $e^{iklm}$, has the
same property. This is the tensor whose components change sign under interchange of any
pair of indices, and whose nonzero components are ±1. From the antisymmetry it follows
that all components in which two indices are the same are zero, so that the only
nonvanishing components are those for which all four indices are different. We set

$$e^{0123} = +1 \tag{6.8}$$

(hence $e_{0123} = -1$). Then all the other nonvanishing components $e^{iklm}$ are equal to +1 or -1,
according as the numbers i, k, l, m can be brought to the arrangement 0, 1, 2, 3 by an even
or an odd number of transpositions. The number of such components is 4! = 24. Thus,

$$e^{iklm} e_{iklm} = -24. \tag{6.9}$$

With respect to rotations of the coordinate system, the quantities $e^{iklm}$ behave like the
components of a tensor; but if we change the sign of one or three of the coordinates the
components $e^{iklm}$, being defined as the same in all coordinate systems, do not change,
whereas some of the components of a tensor should change sign. Thus $e^{iklm}$ is, strictly
speaking, not a tensor, but rather a pseudotensor. Pseudotensors of any rank, in particular
pseudoscalars, behave like tensors under all coordinate transformations except those that
cannot be reduced to rotations, i.e. reflections, which are changes in sign of the coordinates
that are not reducible to a rotation.
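The counting that leads to (6.9) can be reproduced by brute force. The sketch below builds the components $e^{iklm}$ from the parity of the index permutation, lowers all four indices with the signs of the metric (6.5), and sums over all index values; the function names are illustrative assumptions.

```python
from itertools import product

# Diagonal of the metric tensor (6.5).
METRIC_DIAG = (1, -1, -1, -1)


def e_upper(i, k, l, m):
    """Contravariant components e^{iklm}, normalized so that e^{0123} = +1."""
    idx = (i, k, l, m)
    if len(set(idx)) < 4:
        return 0  # any repeated index gives a vanishing component
    # The parity of a permutation equals the parity of its inversion count.
    inversions = sum(1 for a in range(4) for b in range(a + 1, 4)
                     if idx[a] > idx[b])
    return -1 if inversions % 2 else 1


def e_lower(i, k, l, m):
    """Covariant components e_{iklm}: each lowered space index flips the sign."""
    sign = METRIC_DIAG[i] * METRIC_DIAG[k] * METRIC_DIAG[l] * METRIC_DIAG[m]
    return sign * e_upper(i, k, l, m)


all_indices = list(product(range(4), repeat=4))
nonzero_count = sum(1 for idx in all_indices if e_upper(*idx) != 0)
full_contraction = sum(e_upper(*idx) * e_lower(*idx) for idx in all_indices)
print(nonzero_count, full_contraction)
```

Every nonvanishing component has exactly three space indices, so lowering all four indices always produces one overall minus sign; this is the origin of the negative value in (6.9).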

The products $e^{iklm} e^{prst}$ form a four-tensor of rank 8, which is a true tensor; by contracting
on one or more pairs of indices, one obtains tensors of rank 6, 4, and 2. All these tensors
have the same form in all coordinate systems. Thus their components must be expressed as
combinations of products of components of the unit tensor $\delta^i_k$, the only true tensor whose
components are the same in all coordinate systems. These combinations can easily be found
by starting from the symmetries that they must possess under permutation of indices.†

If $A^{ik}$ is an antisymmetric tensor, the tensor $A^{ik}$ and the pseudotensor $A^{*ik} = \frac{1}{2} e^{iklm} A_{lm}$ are
said to be dual to one another. Similarly, $e^{iklm} A_m$ is an antisymmetric pseudotensor of rank
3, dual to the vector $A^i$. The product $A^{ik} A^*_{ik}$ of dual tensors is obviously a pseudoscalar.

In this connection we note some analogous properties of three-dimensional vectors and
tensors. The completely antisymmetric unit pseudotensor of rank 3 is the set of quantities
$e_{\alpha\beta\gamma}$ which change sign under any transposition of a pair of indices. The only nonvanishing
components of $e_{\alpha\beta\gamma}$ are those with three different indices. We set $e_{xyz} = 1$; the others are 1 or
-1, depending on whether the sequence $\alpha$, $\beta$, $\gamma$ can be brought to the order x, y, z by an even
or an odd number of transpositions.‡


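† For reference we give the following formulas:

$$e^{iklm} e_{prst} = -\begin{vmatrix} \delta^i_p & \delta^i_r & \delta^i_s & \delta^i_t \\ \delta^k_p & \delta^k_r & \delta^k_s & \delta^k_t \\ \delta^l_p & \delta^l_r & \delta^l_s & \delta^l_t \\ \delta^m_p & \delta^m_r & \delta^m_s & \delta^m_t \end{vmatrix}, \quad e^{iklm} e_{prsm} = -\begin{vmatrix} \delta^i_p & \delta^i_r & \delta^i_s \\ \delta^k_p & \delta^k_r & \delta^k_s \\ \delta^l_p & \delta^l_r & \delta^l_s \end{vmatrix},$$

$$e^{iklm} e_{prlm} = -2(\delta^i_p \delta^k_r - \delta^i_r \delta^k_p), \quad e^{iklm} e_{pklm} = -6\delta^i_p.$$

The overall coefficient in these formulas can be checked using the result of a complete contraction, which
should give (6.9).

As a consequence of these formulas we have:

$$e^{prst} A_{ip} A_{kr} A_{ls} A_{mt} = -A e_{iklm},$$

$$e^{iklm} e^{prst} A_{ip} A_{kr} A_{ls} A_{mt} = 24A,$$

where A is the determinant formed from the quantities $A_{ik}$.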

‡ The fact that the components of the four-tensor $e^{iklm}$ are unchanged under rotations of the four-dimensional
coordinate system, and that the components of the three-tensor $e_{\alpha\beta\gamma}$ are unchanged by rotations of the space
axes are special cases of a general rule: any completely antisymmetric tensor of rank equal to the number
of dimensions of the space in which it is defined is invariant under rotations of the coordinate system in the
space.

The products $e_{\alpha\beta\gamma} e_{\lambda\mu\nu}$ form a true three-dimensional tensor of rank 6, and are therefore
expressible as combinations of products of components of the unit three-tensor $\delta_{\alpha\beta}$.†

Under a reflection of the coordinate system, i.e. under a change in sign of all the coordinates,
the components of an ordinary vector also change sign. Such vectors are said to be polar.
The components of a vector that can be written as the cross product of two polar vectors do
not change sign under inversion. Such vectors are said to be axial. The scalar product of a
polar and an axial vector is not a true scalar, but rather a pseudoscalar; it changes sign under
a coordinate inversion. An axial vector is a pseudovector, dual to some antisymmetric
tensor. Thus, if $\mathbf{C} = \mathbf{A} \times \mathbf{B}$, then

$$C_\alpha = \frac{1}{2} e_{\alpha\beta\gamma} C_{\beta\gamma}, \quad \text{where } C_{\beta\gamma} = A_\beta B_\gamma - A_\gamma B_\beta.$$

Now consider four-tensors. The space components (i, k = 1, 2, 3) of the antisymmetric
tensor $A^{ik}$ form a three-dimensional antisymmetric tensor with respect to purely spatial
transformations; according to our statement its components can be expressed in terms of the
components of a three-dimensional axial vector. With respect to these same transformations
the components $A^{01}$, $A^{02}$, $A^{03}$ form a three-dimensional polar vector. Thus the components of
an antisymmetric four-tensor can be written as a matrix:

$$(A^{ik}) = \begin{pmatrix} 0 & p_x & p_y & p_z \\ -p_x & 0 & -a_z & a_y \\ -p_y & a_z & 0 & -a_x \\ -p_z & -a_y & a_x & 0 \end{pmatrix}, \tag{6.10}$$

where, with respect to spatial transformations, $\mathbf{p}$ and $\mathbf{a}$ are polar and axial vectors, respectively.
In enumerating the components of an antisymmetric four-tensor, we shall write them in the
form

$$A^{ik} = (\mathbf{p}, \mathbf{a});$$

then the covariant components of the same tensor are

$$A_{ik} = (-\mathbf{p}, \mathbf{a}).$$

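† For reference, we give the appropriate formulas:

$$e_{\alpha\beta\gamma} e_{\lambda\mu\nu} = \begin{vmatrix} \delta_{\alpha\lambda} & \delta_{\alpha\mu} & \delta_{\alpha\nu} \\ \delta_{\beta\lambda} & \delta_{\beta\mu} & \delta_{\beta\nu} \\ \delta_{\gamma\lambda} & \delta_{\gamma\mu} & \delta_{\gamma\nu} \end{vmatrix}.$$

Contracting this tensor on one, two and three pairs of indices, we get:

$$e_{\alpha\beta\gamma} e_{\lambda\mu\gamma} = \delta_{\alpha\lambda}\delta_{\beta\mu} - \delta_{\alpha\mu}\delta_{\beta\lambda},$$

$$e_{\alpha\beta\gamma} e_{\lambda\beta\gamma} = 2\delta_{\alpha\lambda},$$

$$e_{\alpha\beta\gamma} e_{\alpha\beta\gamma} = 6.$$

Finally we consider certain differential and integral operations of four-dimensional tensor
analysis.

The four-gradient of a scalar $\phi$ is the four-vector

$$\frac{\partial\phi}{\partial x^i} = \left(\frac{1}{c}\frac{\partial\phi}{\partial t}, \nabla\phi\right).$$

We must remember that these derivatives are to be regarded as the covariant components of
the four-vector. In fact, the differential of the scalar

$$d\phi = \frac{\partial\phi}{\partial x^i}\,dx^i$$

is also a scalar; from its form (scalar product of two four-vectors) our assertion is obvious.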

In general, the operators of differentiation with respect to the coordinates $x^i$, $\partial/\partial x^i$, should
be regarded as the covariant components of the operator four-vector. Thus, for example, the
divergence of a four-vector, the expression $\partial A^i/\partial x^i$, in which we differentiate the contravariant
components $A^i$, is a scalar.†

In three-dimensional space one can extend integrals over a volume, a surface or a curve. 
In four-dimensional space there are four types of integrations: 

(1) Integral over a curve in four-space. The element of integration is the line element, i.e.
the four-vector $dx^i$.

(2) Integral over a (two-dimensional) surface in four-space. As we know, in three-space
the projections of the area of the parallelogram formed from the vectors $d\mathbf{r}$ and $d\mathbf{r}'$ on the
coordinate planes $x_\alpha x_\beta$ are $dx_\alpha dx'_\beta - dx_\beta dx'_\alpha$. Analogously, in four-space the infinitesimal
element of surface is given by the antisymmetric tensor of second rank $df^{ik} = dx^i dx'^k - dx^k dx'^i$;
its components are the projections of the element of area on the coordinate planes.
In three-dimensional space, as we know, one uses as surface element in place of the tensor
$df_{\alpha\beta}$ the vector $df_\alpha$ dual to the tensor $df_{\alpha\beta}$: $df_\alpha = \frac{1}{2} e_{\alpha\beta\gamma} df_{\beta\gamma}$. Geometrically this is a vector
normal to the surface element and equal in absolute magnitude to the area of the element. In
four-space we cannot construct such a vector, but we can construct the tensor $df^{*ik}$ dual to
the tensor $df^{ik}$,

$$df^{*ik} = \frac{1}{2} e^{iklm} df_{lm}. \tag{6.11}$$

Geometrically it describes an element of surface equal to and “normal” to the element of
surface $df^{ik}$; all segments lying in it are orthogonal to all segments in the element $df^{ik}$. It is
obvious that $df^{ik} df^*_{ik} = 0$.

† If we differentiate with respect to the “covariant coordinates” $x_i$, then the derivatives

$$\frac{\partial\phi}{\partial x_i} = \left(\frac{1}{c}\frac{\partial\phi}{\partial t}, -\nabla\phi\right)$$

form the contravariant components of a four-vector. We shall use this form only in exceptional cases [for
example, for writing the square of the four-gradient $(\partial\phi/\partial x^i)(\partial\phi/\partial x_i)$].

We note that in the literature partial derivatives with respect to the coordinates are often abbreviated
using the symbols

$$\partial^i = \frac{\partial}{\partial x_i}, \quad \partial_i = \frac{\partial}{\partial x^i}.$$

In this form of writing of the differentiation operators, the co- or contravariant character of quantities
formed with them is explicit. This same advantage exists for another abbreviated form for writing derivatives,
using the index preceded by a comma:

$$\phi_{,i} = \frac{\partial\phi}{\partial x^i}, \quad \phi^{,i} = \frac{\partial\phi}{\partial x_i}.$$

(3) Integral over a hypersurface, i.e. over a three-dimensional manifold. In three-dimensional 
space the volume of the parallelepiped spanned by three vectors is equal to the determinant 
of the third rank formed from the components of the vectors. One obtains analogously the 
projections of the volume of the parallelepiped (i.e. the “areas” of the hypersurface) spanned 
by three four-vectors $dx^i$, $dx'^i$, $dx''^i$; they are given by the determinants

$$dS^{ikl} = \begin{vmatrix} dx^i & dx'^i & dx''^i \\ dx^k & dx'^k & dx''^k \\ dx^l & dx'^l & dx''^l \end{vmatrix},$$

which form a tensor of rank 3, antisymmetric in all three indices. As element of integration
over the hypersurface, it is more convenient to use the four-vector $dS^i$, dual to the tensor
$dS^{ikl}$:

$$dS^i = -\frac{1}{6} e^{iklm} dS_{klm}, \quad dS_{klm} = e_{nklm} dS^n. \tag{6.12}$$

Here

$$dS^0 = dS^{123}, \quad dS^1 = dS^{023}, \ldots$$

Geometrically $dS^i$ is a four-vector equal in magnitude to the “areas” of the hypersurface
element, and normal to this element (i.e. perpendicular to all lines drawn in the hypersurface
element). In particular, $dS^0 = dx\,dy\,dz$, i.e. it is the element of three-dimensional volume dV,
the projection of the hypersurface element on the hyperplane $x^0 = \text{const}$.

(4) Integral over a four-dimensional volume; the element of integration is the scalar

$$d\Omega = dx^0\,dx^1\,dx^2\,dx^3 = c\,dt\,dV. \tag{6.13}$$

The element is a scalar: it is obvious that the volume of a portion of four-space is unchanged
by a rotation of the coordinate system.†
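The statement in item (2) that the surface element and its dual satisfy $df^{ik} df^*_{ik} = 0$ can be checked for a concrete pair of four-vectors. The sketch below builds $df^{ik}$ from two arbitrary integer-valued displacements, forms the dual according to (6.11), and contracts the two; all function names and sample values are illustrative assumptions.

```python
# Diagonal of the metric tensor (6.5), used to lower indices.
METRIC_DIAG = (1, -1, -1, -1)


def e_upper(i, k, l, m):
    """Completely antisymmetric unit tensor with e^{0123} = +1."""
    idx = (i, k, l, m)
    if len(set(idx)) < 4:
        return 0
    inversions = sum(1 for a in range(4) for b in range(a + 1, 4)
                     if idx[a] > idx[b])
    return -1 if inversions % 2 else 1


def surface_element(dx, dx_prime):
    """df^{ik} = dx^i dx'^k - dx^k dx'^i."""
    return [[dx[i] * dx_prime[k] - dx[k] * dx_prime[i] for k in range(4)]
            for i in range(4)]


def lower_both(t):
    """Covariant components t_{ik} from contravariant components t^{ik}."""
    return [[METRIC_DIAG[i] * METRIC_DIAG[k] * t[i][k] for k in range(4)]
            for i in range(4)]


def dual(f_upper):
    """Formula (6.11): f*^{ik} = (1/2) e^{iklm} f_{lm}."""
    f_lower = lower_both(f_upper)
    return [[0.5 * sum(e_upper(i, k, l, m) * f_lower[l][m]
                       for l in range(4) for m in range(4))
             for k in range(4)]
            for i in range(4)]


def contract(a_upper, b_lower):
    """Full contraction a^{ik} b_{ik}."""
    return sum(a_upper[i][k] * b_lower[i][k]
               for i in range(4) for k in range(4))


# Two arbitrary four-vectors spanning the surface element.
dx = (1, 2, 0, -1)
dx_prime = (3, 0, 1, 2)

f = surface_element(dx, dx_prime)
f_star = dual(f)
result = contract(f, lower_both(f_star))
print(result)
```

The contraction vanishes because $e^{iklm}$ is antisymmetric while the product of the components of only two distinct vectors, taken four at a time, is symmetric in at least one pair of indices.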

Analogous to the theorems of Gauss and Stokes in three-dimensional vector analysis, 
there are theorems that enable us to transform four-dimensional integrals. 

The integral over a closed hypersurface can be transformed into an integral over the
four-volume contained within it by replacing the element of integration $dS_i$ by the operator

$$dS_i \to d\Omega\,\frac{\partial}{\partial x^i}. \tag{6.14}$$

For example, for the integral of a vector $A^i$ we have:

$$\oint A^i\,dS_i = \int \frac{\partial A^i}{\partial x^i}\,d\Omega. \tag{6.15}$$

† Under a transformation from the integration variables $x^0$, $x^1$, $x^2$, $x^3$ to new variables $x'^0$, $x'^1$, $x'^2$, $x'^3$, the
element of integration changes to $J\,d\Omega'$, where $d\Omega' = dx'^0\,dx'^1\,dx'^2\,dx'^3$ and

$$J = \frac{\partial(x'^0, x'^1, x'^2, x'^3)}{\partial(x^0, x^1, x^2, x^3)}$$

is the Jacobian of the transformation. For a linear transformation of the form $x'^i = \alpha^i_k x^k$, the Jacobian J
coincides with the determinant $|\alpha^i_k|$ and is equal to unity for rotations of the coordinate system; this shows
the invariance of $d\Omega$.


This formula is the generalization of Gauss’ theorem. 

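An integral over a two-dimensional surface is transformed into an integral over the
hypersurface “spanning” it by replacing the element of integration $df^*_{ik}$ by the operator

$$df^*_{ik} \to dS_i\,\frac{\partial}{\partial x^k} - dS_k\,\frac{\partial}{\partial x^i}. \tag{6.16}$$

For example, for the integral of an antisymmetric tensor $A^{ik}$ we have:

$$\frac{1}{2}\int A^{ik}\,df^*_{ik} = \frac{1}{2}\int \left(dS_i\,\frac{\partial A^{ik}}{\partial x^k} - dS_k\,\frac{\partial A^{ik}}{\partial x^i}\right) = \int dS_i\,\frac{\partial A^{ik}}{\partial x^k}. \tag{6.17}$$

The integral over a four-dimensional closed curve is transformed into an integral over the
surface spanning it by the substitution:

$$dx^i \to df^{ki}\,\frac{\partial}{\partial x^k}. \tag{6.18}$$

Thus for the integral of a vector, we have:

$$\oint A_i\,dx^i = \int df^{ki}\,\frac{\partial A_i}{\partial x^k} = \frac{1}{2}\int df^{ik}\left(\frac{\partial A_k}{\partial x^i} - \frac{\partial A_i}{\partial x^k}\right), \tag{6.19}$$

which is the generalization of Stokes’ theorem.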


PROBLEMS 

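1. Find the law of transformation of the components of a symmetric four-tensor $A^{ik}$ under Lorentz
transformations (6.1).

Solution: Considering the components of the tensor as products of components of two four-vectors, we
get:

$$A^{00} = \frac{1}{1 - V^2/c^2}\left(A'^{00} + 2\frac{V}{c}A'^{01} + \frac{V^2}{c^2}A'^{11}\right), \quad A^{11} = \frac{1}{1 - V^2/c^2}\left(A'^{11} + 2\frac{V}{c}A'^{01} + \frac{V^2}{c^2}A'^{00}\right),$$

$$A^{22} = A'^{22}, \quad A^{23} = A'^{23}, \quad A^{12} = \frac{1}{\sqrt{1 - V^2/c^2}}\left(A'^{12} + \frac{V}{c}A'^{02}\right),$$

$$A^{01} = \frac{1}{1 - V^2/c^2}\left[A'^{01}\left(1 + \frac{V^2}{c^2}\right) + \frac{V}{c}A'^{00} + \frac{V}{c}A'^{11}\right], \quad A^{02} = \frac{1}{\sqrt{1 - V^2/c^2}}\left(A'^{02} + \frac{V}{c}A'^{12}\right),$$

and analogous formulas for $A^{33}$, $A^{13}$ and $A^{03}$.

2. The same for the antisymmetric tensor $A^{ik}$.

Solution: Since the coordinates $x^2$ and $x^3$ do not change, the tensor component $A^{23}$ does not change,
while the components $A^{12}$, $A^{13}$ and $A^{02}$, $A^{03}$ transform like $x^1$ and $x^0$:

$$A^{23} = A'^{23}, \quad A^{12} = \frac{A'^{12} + \frac{V}{c}A'^{02}}{\sqrt{1 - V^2/c^2}}, \quad A^{02} = \frac{A'^{02} + \frac{V}{c}A'^{12}}{\sqrt{1 - V^2/c^2}},$$

and similarly for $A^{13}$, $A^{03}$.

With respect to rotations of the two-dimensional coordinate system in the plane $x^0 x^1$ (which are the
transformations we are considering) the components $A^{01} = -A^{10}$, $A^{00} = A^{11} = 0$, form an antisymmetric
tensor of rank two, equal to the number of dimensions of the space. Thus (see the remark on p. 19) these
components are not changed by the transformations:

$$A^{01} = A'^{01}.$$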


§ 7. Four-dimensional velocity 

From the ordinary three-dimensional velocity vector one can form a four-vector. This
four-dimensional velocity (four-velocity) of a particle is the vector

$$u^i = \frac{dx^i}{ds}. \tag{7.1}$$

To find its components, we note that according to (3.1),

$$ds = c\,dt\sqrt{1 - \frac{v^2}{c^2}},$$

where v is the ordinary three-dimensional velocity of the particle. Thus

$$u^1 = \frac{dx^1}{ds} = \frac{dx}{c\,dt\sqrt{1-v^2/c^2}} = \frac{v_x}{c\sqrt{1-v^2/c^2}},$$

and similarly for the other components, so that

$$u^i = \left(\frac{1}{\sqrt{1-v^2/c^2}},\ \frac{\mathbf{v}}{c\sqrt{1-v^2/c^2}}\right). \tag{7.2}$$

Note that the four-velocity is a dimensionless quantity.

The components of the four-velocity are not independent. Noting that $dx_i\,dx^i = ds^2$, we
have

$$u^i u_i = 1. \tag{7.3}$$

Geometrically, $u^i$ is a unit four-vector tangent to the world line of the particle.

Similarly to the definition of the four-velocity, the second derivative

$$w^i = \frac{d^2x^i}{ds^2} = \frac{du^i}{ds}$$

may be called the four-acceleration. Differentiating formula (7.3), we find:

$$u_i w^i = 0, \tag{7.4}$$

i.e. the four-vectors of velocity and acceleration are “mutually perpendicular”.

PROBLEM 

Determine the relativistic uniformly accelerated motion, i.e. the rectilinear motion for which the acceleration 
w in the proper reference frame (at each instant of time) remains constant. 

Solution: In the reference frame in which the particle velocity is v = 0, the components of the
four-acceleration are $w^i = (0, w/c^2, 0, 0)$ (where w is the ordinary three-dimensional
acceleration, which is directed along the x axis). The relativistically invariant condition for
uniform acceleration must be expressed by the constancy of the four-scalar which coincides with
$w^2$ (up to the factor $-1/c^4$) in the proper reference frame:

$$w^i w_i = \text{const} \equiv -\frac{w^2}{c^4}.$$

In the “fixed” frame, with respect to which the motion is observed, writing out the expression
for $w^i w_i$ gives the equation

$$\frac{d}{dt}\,\frac{v}{\sqrt{1-v^2/c^2}} = w, \qquad \text{or} \qquad \frac{v}{\sqrt{1-v^2/c^2}} = wt + \text{const}.$$

Setting v = 0 for t = 0, we find that const = 0, so that

$$v = \frac{wt}{\sqrt{1 + w^2t^2/c^2}}.$$

Integrating once more and setting x = 0 for t = 0, we find:

$$x = \frac{c^2}{w}\left(\sqrt{1 + \frac{w^2t^2}{c^2}} - 1\right).$$

For $wt \ll c$, these formulas go over into the classical expressions $v = wt$, $x = wt^2/2$. For
$wt \to \infty$, the velocity tends toward the constant value c.

The proper time of a uniformly accelerated particle is given by the integral

$$\int_0^t \sqrt{1 - \frac{v^2}{c^2}}\,dt = \frac{c}{w}\,\sinh^{-1}\frac{wt}{c}.$$

As $t \to \infty$, it increases much more slowly than t, according to the law $(c/w)\ln(2wt/c)$.
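The closed-form results of this problem, $v = wt/\sqrt{1+w^2t^2/c^2}$, $x = (c^2/w)(\sqrt{1+w^2t^2/c^2}-1)$ and proper time $(c/w)\sinh^{-1}(wt/c)$, are easy to explore numerically. The following is a minimal sketch, not part of the original text; the function names and the SI value of c are illustrative choices.

```python
import math

C = 299_792_458.0  # speed of light in m/s (illustrative choice of units)


def velocity(w, t, c=C):
    """Velocity v(t) = w t / sqrt(1 + w^2 t^2 / c^2), starting from rest at t = 0."""
    u = w * t / c
    return w * t / math.sqrt(1.0 + u * u)


def position(w, t, c=C):
    """Position x(t) = (c^2 / w) (sqrt(1 + w^2 t^2 / c^2) - 1), with x(0) = 0.

    Evaluated as w t^2 / (1 + sqrt(1 + u^2)), which is algebraically the same
    expression but avoids cancellation when u = w t / c is very small.
    """
    u = w * t / c
    return w * t * t / (1.0 + math.sqrt(1.0 + u * u))


def proper_time(w, t, c=C):
    """Proper time (c / w) asinh(w t / c) elapsed on the particle's own clock."""
    return (c / w) * math.asinh(w * t / c)


if __name__ == "__main__":
    w = 9.81  # proper acceleration in m/s^2
    for t in (1.0, 3.0e7, 3.0e8):  # one second, about one year, about ten years
        print(t, velocity(w, t) / C, position(w, t), proper_time(w, t))
```

For small $wt/c$ the functions reproduce the classical $v = wt$ and $x = wt^2/2$; for large $wt/c$ the velocity saturates below c and the proper time grows only logarithmically.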


CHAPTER 2 


RELATIVISTIC MECHANICS 


§ 8. The principle of least action 

In studying the motion of material particles, we shall start from the Principle of Least
Action. The principle of least action is defined, as we know, by the statement that for each
mechanical system there exists a certain integral S, called the action, which has a minimum
value for the actual motion, so that its variation $\delta S$ is zero.†

To determine the action integral for a free material particle (a particle not under the
influence of any external force), we note that this integral must not depend on our choice of
reference system, that is, it must be invariant under Lorentz transformations. Then it follows
that it must depend on a scalar. Furthermore, it is clear that the integrand must be a differential
of the first order. But the only scalar of this kind that one can construct for a free particle is
the interval ds, or $\alpha\,ds$, where $\alpha$ is some constant. So for a free particle the action
must have the form

$$S = -\alpha\int_a^b ds,$$

where $\int_a^b$ is an integral along the world line of the particle between the two particular
events of the arrival of the particle at the initial position and at the final position at definite
times $t_1$ and $t_2$, i.e. between two given world points; and $\alpha$ is some constant
characterizing the particle. It is easy to see that $\alpha$ must be a positive quantity for all
particles. In fact, as we saw in § 3, $\int_a^b ds$ has its maximum value along a straight world
line; by integrating along a curved world line we can make the integral arbitrarily small. Thus
the integral $\alpha\int ds$ with the positive sign cannot have a minimum; with the opposite sign it
clearly has a minimum, along the straight world line.

The action integral can be represented as an integral with respect to the time

$$S = \int_{t_1}^{t_2} L\,dt.$$

The coefficient L of dt represents the Lagrange function of the mechanical system. With the
aid of (3.1), we find:

$$S = -\int_{t_1}^{t_2} \alpha c\sqrt{1 - \frac{v^2}{c^2}}\,dt,$$

where v is the velocity of the material particle. Consequently the Lagrangian for the particle
is

$$L = -\alpha c\sqrt{1 - v^2/c^2}.$$

The quantity $\alpha$, as already mentioned, characterizes the particle. In classical mechanics
each particle is characterized by its mass m. Let us find the relation between $\alpha$ and m. It
can be determined from the fact that in the limit as $c \to \infty$, our expression for L must go over
into the classical expression $L = mv^2/2$. To carry out this transition we expand L in powers of
v/c. Then, neglecting terms of higher order, we find

$$L = -\alpha c\sqrt{1 - \frac{v^2}{c^2}} \approx -\alpha c + \frac{\alpha v^2}{2c}.$$

Constant terms in the Lagrangian do not affect the equation of motion and can be omitted.
Omitting the constant $\alpha c$ in L and comparing with the classical expression $L = mv^2/2$, we
find that $\alpha = mc$.

Thus the action for a free material point is

$$S = -mc\int_a^b ds, \tag{8.1}$$

and the Lagrangian is

$$L = -mc^2\sqrt{1 - \frac{v^2}{c^2}}. \tag{8.2}$$

† Strictly speaking, the principle of least action asserts that the integral S must be a minimum
only for infinitesimal lengths of the path of integration. For paths of arbitrary length we can say
only that S must be an extremum, not necessarily a minimum. (See Mechanics, § 2.)

§ 9. Energy and momentum 

By the momentum of a particle we can mean the vector $\mathbf{p} = \partial L/\partial\mathbf{v}$
($\partial L/\partial\mathbf{v}$ is the symbolic representation of the vector whose components are
the derivatives of L with respect to the corresponding components of v). Using (8.2), we find:

$$\mathbf{p} = \frac{m\mathbf{v}}{\sqrt{1 - v^2/c^2}}. \tag{9.1}$$

For small velocities ($v \ll c$) or, in the limit as $c \to \infty$, this expression goes over into the
classical p = mv. For v = c, the momentum becomes infinite.

The time derivative of the momentum is the force acting on the particle. Suppose the
velocity of the particle changes only in direction, that is, suppose the force is directed
perpendicular to the velocity. Then

$$\frac{d\mathbf{p}}{dt} = \frac{m}{\sqrt{1 - v^2/c^2}}\,\frac{d\mathbf{v}}{dt}. \tag{9.2}$$

If the velocity changes only in magnitude, that is, if the force is parallel to the velocity, then

$$\frac{d\mathbf{p}}{dt} = \frac{m}{\left(1 - v^2/c^2\right)^{3/2}}\,\frac{d\mathbf{v}}{dt}. \tag{9.3}$$

We see that the ratio of force to acceleration is different in the two cases.

The energy $\mathcal{E}$ of the particle is defined as the quantity†

$$\mathcal{E} = \mathbf{p}\cdot\mathbf{v} - L.$$

Substituting the expressions (8.2) and (9.1) for L and p, we find

$$\mathcal{E} = \frac{mc^2}{\sqrt{1 - v^2/c^2}}. \tag{9.4}$$


This very important formula shows, in particular, that in relativistic mechanics the energy
of a free particle does not go to zero for v = 0, but rather takes on a finite value

$$\mathcal{E} = mc^2. \tag{9.5}$$

This quantity is called the rest energy of the particle.

For small velocities ($v/c \ll 1$), we have, expanding (9.4) in series in powers of v/c,

$$\mathcal{E} \approx mc^2 + \frac{mv^2}{2},$$

which, except for the rest energy, is the classical expression for the kinetic energy of a
particle.

We emphasize that, although we speak of a “particle”, we have nowhere made use of the 
fact that it is “elementary”. Thus the formulas are equally applicable to any composite body 
consisting of many particles, where by m we mean the total mass of the body and by v the 
velocity of its motion as a whole. In particular, formula (9.5) is valid for any body which is 
at rest as a whole. We call attention to the fact that in relativistic mechanics the energy of a 
free body (i.e. the energy of any closed system) is a completely definite quantity which is 
always positive and is directly related to the mass of the body. In this connection we recall 
that in classical mechanics the energy of a body is defined only to within an arbitrary 
constant, and can be either positive or negative. 

The energy of a body at rest contains, in addition to the rest energies of its constituent
particles, the kinetic energy of the particles and the energy of their interactions with one
another. In other words, $mc^2$ is not equal to $\sum m_a c^2$ (where $m_a$ are the masses of the
particles), and so m is not equal to $\sum m_a$. Thus in relativistic mechanics the law of
conservation of mass does not hold: the mass of a composite body is not equal to the sum of the
masses of its parts. Instead only the law of conservation of energy, in which the rest energies of
the particles are included, is valid.

Squaring (9.1) and (9.4) and comparing the results, we get the following relation between
the energy and momentum of a particle:

$$\frac{\mathcal{E}^2}{c^2} = p^2 + m^2c^2. \tag{9.6}$$

† See Mechanics, § 6.

The energy expressed in terms of the momentum is called the Hamiltonian function $\mathcal{H}$:

$$\mathcal{H} = c\sqrt{p^2 + m^2c^2}. \tag{9.7}$$

For low velocities, $p \ll mc$, and we have approximately

$$\mathcal{H} \approx mc^2 + \frac{p^2}{2m},$$

i.e., except for the rest energy we get the familiar classical expression for the Hamiltonian.

From (9.1) and (9.4) we get the following relation between the energy, momentum, and
velocity of a free particle:

$$\mathbf{p} = \frac{\mathcal{E}\mathbf{v}}{c^2}. \tag{9.8}$$

For v = c, the momentum and energy of the particle become infinite. This means that a
particle with mass m different from zero cannot move with the velocity of light. Nevertheless,
in relativistic mechanics, particles of zero mass moving with the velocity of light can exist.†
From (9.8) we have for such particles:

$$p = \frac{\mathcal{E}}{c}. \tag{9.9}$$

The same formula also holds approximately for particles with nonzero mass in the so-called
ultrarelativistic case, when the particle energy $\mathcal{E}$ is large compared to its rest energy $mc^2$.
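The relations (9.1), (9.4), (9.7) and (9.8) can be cross-checked numerically. The sketch below is an illustration only; the function names and the default choice c = 1 are assumptions made for the example, not notation from the text.

```python
import math


def momentum(m, v, c=1.0):
    """Magnitude of the momentum, p = m v / sqrt(1 - v^2/c^2), formula (9.1)."""
    return m * v / math.sqrt(1.0 - (v / c) ** 2)


def energy(m, v, c=1.0):
    """Energy E = m c^2 / sqrt(1 - v^2/c^2), formula (9.4)."""
    return m * c * c / math.sqrt(1.0 - (v / c) ** 2)


def hamiltonian(m, p, c=1.0):
    """Energy as a function of momentum, H = c sqrt(p^2 + m^2 c^2), formula (9.7)."""
    return c * math.sqrt(p * p + (m * c) ** 2)


if __name__ == "__main__":
    m, v, c = 2.0, 0.6, 1.0
    p, E = momentum(m, v, c), energy(m, v, c)
    # (9.4) and (9.7) must agree, and p must equal E v / c^2 as in (9.8).
    print(abs(E - hamiltonian(m, p, c)) < 1e-12)
    print(abs(p - E * v / c**2) < 1e-12)
    # For v << c the energy is close to m c^2 + m v^2 / 2.
    print(energy(m, 1e-3, c) - (m * c * c + 0.5 * m * 1e-6))
```

The last printed difference is the first neglected term of the expansion in v/c, which is of fourth order in v/c.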

We now write all our formulas in four-dimensional form. According to the principle of
least action,

$$\delta S = -mc\,\delta\int_a^b ds = 0.$$

To set up the expression for $\delta S$, we note that $ds = \sqrt{dx_i\,dx^i}$ and therefore

$$\delta S = -mc\int_a^b \frac{dx_i\,\delta dx^i}{ds} = -mc\int_a^b u_i\,d\delta x^i.$$

Integrating by parts, we obtain

$$\delta S = -mc\,u_i\,\delta x^i\Big|_a^b + mc\int_a^b \delta x^i\,\frac{du_i}{ds}\,ds. \tag{9.10}$$

As we know, to get the equations of motion we compare different trajectories between the
same two points, i.e. at the limits $(\delta x^i)_a = (\delta x^i)_b = 0$. The actual trajectory is then
determined from the condition $\delta S = 0$. From (9.10) we thus obtain the equations
$du_i/ds = 0$; that is, a constant velocity for the free particle in four-dimensional form.

† For example, light quanta and neutrinos.

To determine the variation of the action as a function of the coordinates, one must consider
the point a as fixed, so that $(\delta x^i)_a = 0$. The second point is to be considered as variable, but
only actual trajectories are admissible, i.e., those which satisfy the equations of motion.
Therefore the integral in expression (9.10) for $\delta S$ is zero. In place of $(\delta x^i)_b$ we may write
simply $\delta x^i$, and thus obtain

$$\delta S = -mc\,u_i\,\delta x^i. \tag{9.11}$$


The four-vector

$$p_i = -\frac{\partial S}{\partial x^i} \tag{9.12}$$

is called the momentum four-vector. As we know from mechanics, the derivatives
$\partial S/\partial x$, $\partial S/\partial y$, $\partial S/\partial z$ are the three components of the momentum
vector p of the particle, while the derivative $-\partial S/\partial t$ is the particle energy
$\mathcal{E}$. Thus the covariant components of the four-momentum are
$p_i = (\mathcal{E}/c, -\mathbf{p})$, while the contravariant components are†

$$p^i = (\mathcal{E}/c,\ \mathbf{p}). \tag{9.13}$$

From (9.11) we see that the components of the four-momentum of a free particle are:

$$p^i = mcu^i. \tag{9.14}$$

Substituting the components of the four-velocity from (7.2), we see that we actually get
expressions (9.1) and (9.4) for p and $\mathcal{E}$.

Thus, in relativistic mechanics, momentum and energy are the components of a single
four-vector. From this we immediately get the formulas for transformation of momentum
and energy from one inertial system to another. Substituting (9.13) in the general formulas
(6.1) for transformation of four-vectors, we find:

$$p_x = \frac{p'_x + \frac{V}{c^2}\mathcal{E}'}{\sqrt{1 - V^2/c^2}}, \qquad p_y = p'_y, \qquad p_z = p'_z, \qquad \mathcal{E} = \frac{\mathcal{E}' + Vp'_x}{\sqrt{1 - V^2/c^2}}, \tag{9.15}$$

where $p_x$, $p_y$, $p_z$ are the components of the three-dimensional vector p.

From the definition (9.14) of the four-momentum, and the identity $u^i u_i = 1$, we have, for
the square of the four-momentum of a free particle:

$$p_i p^i = m^2c^2. \tag{9.16}$$

Substituting the expressions (9.13), we get back (9.6).
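As an illustration of the transformation (9.15), $p_x = (p'_x + V\mathcal{E}'/c^2)/\sqrt{1-V^2/c^2}$, $\mathcal{E} = (\mathcal{E}' + Vp'_x)/\sqrt{1-V^2/c^2}$, and of the invariance of $p_i p^i = \mathcal{E}^2/c^2 - p^2$, here is a small sketch. The function names are hypothetical and c defaults to 1 for convenience.

```python
import math


def boost_to_K(E_prime, p_prime, V, c=1.0):
    """Energy and momentum in K from their values in K', using (9.15).

    K' moves with velocity V along the x axis of K; p_prime is (px', py', pz').
    """
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    px, py, pz = p_prime
    E = gamma * (E_prime + V * px)
    p = (gamma * (px + V * E_prime / c**2), py, pz)
    return E, p


def four_momentum_square(E, p, c=1.0):
    """Invariant p_i p^i = E^2/c^2 - p^2, equal to m^2 c^2 by (9.16)."""
    return (E / c) ** 2 - sum(comp * comp for comp in p)


if __name__ == "__main__":
    m = 1.0
    p_prime = (0.3, 0.4, 0.5)
    E_prime = math.sqrt(sum(x * x for x in p_prime) + m * m)  # (9.6) with c = 1
    E, p = boost_to_K(E_prime, p_prime, V=0.6)
    # Both squares should equal m^2 c^2 up to rounding.
    print(four_momentum_square(E_prime, p_prime), four_momentum_square(E, p))
```

Applying the function to a particle at rest in K' reproduces (9.1) and (9.4) for a particle moving with velocity V in K.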

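By analogy with the usual definition of the force, the force four-vector is defined as the
derivative:

$$g^i = \frac{dp^i}{ds} = mc\,\frac{du^i}{ds}. \tag{9.17}$$

† We call attention to a mnemonic for remembering the definition of the physical four-vectors:
the contravariant components are related to the corresponding three-dimensional vectors (r for
$x^i$, p for $p^i$) with the “right”, positive sign.

Its components satisfy the identity $g_i u^i = 0$. The components of this four-vector are
expressed in terms of the usual three-dimensional force vector $\mathbf{f} = d\mathbf{p}/dt$:

$$g^i = \left(\frac{\mathbf{f}\cdot\mathbf{v}}{c^2\sqrt{1 - v^2/c^2}},\ \frac{\mathbf{f}}{c\sqrt{1 - v^2/c^2}}\right). \tag{9.18}$$

The time component is related to the work done by the force.

The relativistic Hamilton-Jacobi equation is obtained by substituting the derivatives
$-\partial S/\partial x^i$ for $p_i$ in (9.16):

$$\frac{\partial S}{\partial x_i}\frac{\partial S}{\partial x^i} \equiv g^{ik}\,\frac{\partial S}{\partial x^i}\frac{\partial S}{\partial x^k} = m^2c^2, \tag{9.19}$$

or, writing the sum explicitly:

$$\frac{1}{c^2}\left(\frac{\partial S}{\partial t}\right)^2 - \left(\frac{\partial S}{\partial x}\right)^2 - \left(\frac{\partial S}{\partial y}\right)^2 - \left(\frac{\partial S}{\partial z}\right)^2 = m^2c^2. \tag{9.20}$$

The transition to the limiting case of classical mechanics in equation (9.19) is made as
follows. First of all we must notice that just as in the corresponding transition with (9.7), the
energy of a particle in relativistic mechanics contains the term $mc^2$, which it does not in
classical mechanics. Inasmuch as the action S is related to the energy by
$\mathcal{E} = -(\partial S/\partial t)$, in making the transition to classical mechanics we must in place
of S substitute a new action S' according to the relation:

$$S = S' - mc^2t.$$

Substituting this in (9.20), we find

$$\frac{1}{2m}\left[\left(\frac{\partial S'}{\partial x}\right)^2 + \left(\frac{\partial S'}{\partial y}\right)^2 + \left(\frac{\partial S'}{\partial z}\right)^2\right] - \frac{1}{2mc^2}\left(\frac{\partial S'}{\partial t}\right)^2 + \frac{\partial S'}{\partial t} = 0.$$

In the limit as $c \to \infty$, this equation goes over into the classical Hamilton-Jacobi equation.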


§ 10. Transformation of distribution functions 

In various physical problems we have to deal with distribution functions for the momenta
of particles: $f(\mathbf{p})\,dp_x\,dp_y\,dp_z$ is the number of particles having momenta with
components in given intervals $dp_x$, $dp_y$, $dp_z$ (or, as we say for brevity, the number of
particles in a given volume element $d^3p = dp_x\,dp_y\,dp_z$ in “momentum space”). We are then
faced with the problem of finding the law of transformation of the distribution function
$f(\mathbf{p})$ when we transform from one reference system to another.

To solve this problem, we first determine the properties of the “volume element”
$dp_x\,dp_y\,dp_z$ with respect to Lorentz transformations. If we introduce a four-dimensional
coordinate system, on whose axes are marked the components of the four-momentum of a particle,
then $dp_x\,dp_y\,dp_z$ can be considered as the zeroth component of an element of the hypersurface
defined by the equation $p^i p_i = m^2c^2$. The element of hypersurface is a four-vector directed
along the normal to the hypersurface; in our case the direction of the normal obviously
coincides with the direction of the four-vector $p_i$. From this it follows that the ratio

$$\frac{dp_x\,dp_y\,dp_z}{\mathcal{E}} \tag{10.1}$$

is an invariant quantity, since it is the ratio of corresponding components of two parallel
four-vectors.†

The number of particles, $f\,dp_x\,dp_y\,dp_z$, is also obviously an invariant, since it does not
depend on the choice of reference frame. Writing it in the form

$$f(\mathbf{p})\,\mathcal{E}\,\frac{dp_x\,dp_y\,dp_z}{\mathcal{E}}$$

and using the invariance of the ratio (10.1), we conclude that the product
$f(\mathbf{p})\mathcal{E}$ is invariant. Thus the distribution function in the K' system is related
to the distribution function in the K system by the formula

$$f'(\mathbf{p}') = \frac{f(\mathbf{p})\,\mathcal{E}}{\mathcal{E}'}, \tag{10.2}$$

where p and $\mathcal{E}$ must be expressed in terms of p' and $\mathcal{E}'$ by using the
transformation formulas (9.15).

Let us now return to the invariant expression (10.1). If we introduce “spherical coordinates”
in momentum space, the volume element $dp_x\,dp_y\,dp_z$ becomes $p^2\,dp\,do$, where do is the
element of solid angle around the direction of the vector p. Noting that
$p\,dp = \mathcal{E}\,d\mathcal{E}/c^2$ [from (9.6)], we have:

$$\frac{p^2\,dp\,do}{\mathcal{E}} = \frac{p\,d\mathcal{E}\,do}{c^2}.$$

Thus we find that the expression

$$p\,d\mathcal{E}\,do \tag{10.3}$$

is also invariant.
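The invariance of the ratio (10.1) means that under a boost the momentum-space volume scales as $d^3p/d^3p' = \mathcal{E}/\mathcal{E}'$. For a boost along x only $p_x$ changes, so the Jacobian reduces to $\partial p_x/\partial p'_x$ at fixed $p'_y$, $p'_z$. The sketch below estimates that derivative by a central finite difference; it assumes units with c = 1, and the function names and step size are illustrative.

```python
import math


def energy_from_momentum(p, m):
    """E = sqrt(p^2 + m^2) in units with c = 1, from (9.6)."""
    return math.sqrt(sum(comp * comp for comp in p) + m * m)


def boosted_px(p_prime, m, V):
    """x-component of the momentum in K for momentum p_prime in K', from (9.15) with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - V * V)
    return gamma * (p_prime[0] + V * energy_from_momentum(p_prime, m))


def volume_ratio(p_prime, m, V, h=1e-6):
    """Finite-difference estimate of d^3p / d^3p' for a boost along x.

    p_y and p_z are unchanged by the boost, so the Jacobian determinant
    reduces to d p_x / d p'_x taken at fixed p'_y and p'_z.
    """
    px, py, pz = p_prime
    plus = boosted_px((px + h, py, pz), m, V)
    minus = boosted_px((px - h, py, pz), m, V)
    return (plus - minus) / (2.0 * h)


if __name__ == "__main__":
    p_prime, m, V = (0.3, 0.4, 0.5), 1.0, 0.6
    E_prime = energy_from_momentum(p_prime, m)
    E = (E_prime + V * p_prime[0]) / math.sqrt(1.0 - V * V)
    # The two printed numbers should agree to within the finite-difference error.
    print(volume_ratio(p_prime, m, V), E / E_prime)
```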

The notion of a distribution function appears in a different aspect in the kinetic theory of
gases: the product $f(\mathbf{r}, \mathbf{p})\,dp_x\,dp_y\,dp_z\,dV$ is the number of particles lying
in a given volume element dV and having momenta in definite intervals $dp_x$, $dp_y$, $dp_z$. The
function $f(\mathbf{r}, \mathbf{p})$ is called the distribution function in phase space (the space of
the coordinates and momenta of the particle), and the product of differentials
$d\tau = d^3p\,dV$ is the element of volume of this space. We shall find the law of transformation
of this function.

† The integration with respect to the element (10.1) can be expressed in four-dimensional form
by means of the δ-function (cf. the footnote on p. 74) as an integration with respect to

$$\frac{2}{c}\,\delta(p^i p_i - m^2c^2)\,d^4p, \qquad d^4p = dp^0\,dp^1\,dp^2\,dp^3. \tag{10.1a}$$

The four components $p^i$ are treated as independent variables (with $p^0$ taking on only positive
values). Formula (10.1a) is obvious from the following representation of the delta function
appearing in it:

$$\delta(p^i p_i - m^2c^2) = \delta\!\left(p_0^2 - \frac{\mathcal{E}^2}{c^2}\right) = \frac{c}{2\mathcal{E}}\left[\delta\!\left(p_0 + \frac{\mathcal{E}}{c}\right) + \delta\!\left(p_0 - \frac{\mathcal{E}}{c}\right)\right], \tag{10.1b}$$

where $\mathcal{E} = c\sqrt{p^2 + m^2c^2}$. This formula in turn follows from formula (V) of the
footnote on p. 74.

In addition to the two reference systems K and K', we also introduce the frame $K_0$ in which
the particles with the given momentum are at rest; the proper volume $dV_0$ of the element
occupied by the particles is defined relative to this system. The velocities of the systems K
and K' relative to the system $K_0$ coincide, by definition, with the velocities v and v' which
these particles have in the systems K and K'. Thus, according to (4.6), we have

$$dV = dV_0\sqrt{1 - \frac{v^2}{c^2}}, \qquad dV' = dV_0\sqrt{1 - \frac{v'^2}{c^2}},$$

from which

$$\frac{dV}{dV'} = \frac{\mathcal{E}'}{\mathcal{E}}.$$

Multiplying this equation by the equation $d^3p/d^3p' = \mathcal{E}/\mathcal{E}'$, we find that

$$d\tau = d\tau', \tag{10.4}$$

i.e. the element of phase volume is invariant. Since the number of particles $f\,d\tau$ is also
invariant, by definition, we conclude that the distribution function in phase space is an
invariant:

$$f'(\mathbf{r}', \mathbf{p}') = f(\mathbf{r}, \mathbf{p}), \tag{10.5}$$

where r', p' are related to r, p by the formulas for the Lorentz transformation.

§ 11. Decay of particles 

Let us consider the spontaneous decay of a body of mass M into two parts with masses $m_1$
and $m_2$. The law of conservation of energy in the decay, applied in the system of reference
in which the body is at rest, gives†

$$M = \mathcal{E}_{10} + \mathcal{E}_{20}, \tag{11.1}$$

where $\mathcal{E}_{10}$ and $\mathcal{E}_{20}$ are the energies of the emerging particles. Since
$\mathcal{E}_{10} > m_1$ and $\mathcal{E}_{20} > m_2$, the equality (11.1) can be satisfied only if
$M > m_1 + m_2$, i.e. a body can disintegrate spontaneously into parts the sum of whose masses is
less than the mass of the body. On the other hand, if $M < m_1 + m_2$ the body is stable (with
respect to the particular decay) and does not decay spontaneously. To cause the decay in this
case, we would have to supply to the body from outside an amount of energy at least equal to its
“binding energy” $(m_1 + m_2 - M)$.

Momentum as well as energy must be conserved in the decay process. Since the initial
momentum of the body was zero, the sum of the momenta of the emerging particles must be
zero: $\mathbf{p}_{10} + \mathbf{p}_{20} = 0$. Consequently $p_{10}^2 = p_{20}^2$, or

$$\mathcal{E}_{10}^2 - m_1^2 = \mathcal{E}_{20}^2 - m_2^2. \tag{11.2}$$

† In §§ 11-13 we set c = 1. In other words the velocity of light is taken as the unit of velocity
(so that the dimensions of length and time become the same). This choice is a natural one in
relativistic mechanics and greatly simplifies the writing of formulas. However, in this book
(which also contains a considerable amount of nonrelativistic theory) we shall not usually use
this system of units, and will explicitly indicate when we do.

If c has been set equal to unity in formulas, it is easy to convert back to ordinary units: the
velocity is introduced to assure correct dimensions.

The two equations (11.1) and (11.2) uniquely determine the energies of the emerging particles:

$$\mathcal{E}_{10} = \frac{M^2 + m_1^2 - m_2^2}{2M}, \qquad \mathcal{E}_{20} = \frac{M^2 - m_1^2 + m_2^2}{2M}. \tag{11.3}$$
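Solving (11.1) and (11.2) together gives $\mathcal{E}_{10} = (M^2 + m_1^2 - m_2^2)/2M$ and $\mathcal{E}_{20} = (M^2 - m_1^2 + m_2^2)/2M$. A minimal sketch of this computation, in units with c = 1; the function name and the error handling are illustrative choices, not part of the text.

```python
def decay_energies(M, m1, m2):
    """Energies (E10, E20) of the two decay products in the rest frame of M, from (11.3).

    Units with c = 1. Spontaneous decay is possible only if M > m1 + m2.
    """
    if M <= m1 + m2:
        raise ValueError("spontaneous decay requires M > m1 + m2")
    E10 = (M * M + m1 * m1 - m2 * m2) / (2.0 * M)
    E20 = (M * M - m1 * m1 + m2 * m2) / (2.0 * M)
    return E10, E20


if __name__ == "__main__":
    E10, E20 = decay_energies(5.0, 3.0, 1.0)
    print(E10 + E20)                          # energy conservation (11.1): equals M
    print(E10**2 - 3.0**2, E20**2 - 1.0**2)   # equal squared momenta, as in (11.2)
```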


In a certain sense the inverse of this problem is the calculation of the total energy M of two 
colliding particles in the system of reference in which their total momentum is zero. (This 
is abbreviated as the “system of the centre of inertia” or the “C-system”.) The computation 
of this quantity gives a criterion for the possible occurrence of various inelastic collision 
processes, accompanied by a change in state of the colliding particles, or the “creation” of 
new particles. A process of this type can occur only if the sum of the masses of the “reaction 
products” does not exceed M. 

Suppose that in the initial reference system (the “laboratory” system) a particle with mass
$m_1$ and energy $\mathcal{E}_1$ collides with a particle of mass $m_2$ which is at rest. The total
energy of the two particles is

$$\mathcal{E} = \mathcal{E}_1 + \mathcal{E}_2 = \mathcal{E}_1 + m_2,$$

and their total momentum is $\mathbf{p} = \mathbf{p}_1 + \mathbf{p}_2 = \mathbf{p}_1$. Considering the
two particles together as a single composite system, we find the velocity of its motion as a whole
from (9.8):

$$\mathbf{V} = \frac{\mathbf{p}}{\mathcal{E}} = \frac{\mathbf{p}_1}{\mathcal{E}_1 + m_2}. \tag{11.4}$$

This quantity is the velocity of the C-system with respect to the laboratory system (the
L-system).

However, in determining the mass M, there is no need to transform from one reference
frame to the other. Instead we can make direct use of formula (9.6), which is applicable to
the composite system just as it is to each particle individually. We thus have

$$M^2 = \mathcal{E}^2 - p^2 = (\mathcal{E}_1 + m_2)^2 - (\mathcal{E}_1^2 - m_1^2),$$

from which

$$M^2 = m_1^2 + m_2^2 + 2m_2\mathcal{E}_1. \tag{11.5}$$
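Formula (11.5), $M^2 = m_1^2 + m_2^2 + 2m_2\mathcal{E}_1$, together with the criterion that the sum of the masses of the reaction products must not exceed M, gives the threshold energy of the incident particle directly. The following sketch (units with c = 1) uses hypothetical function names; the example of four equal-mass products from two equal-mass particles is an illustration chosen here, not one taken from the text.

```python
import math


def cm_energy(E1, m1, m2):
    """Total C-system energy M for a projectile (E1, m1) on a target m2 at rest, from (11.5)."""
    return math.sqrt(m1 * m1 + m2 * m2 + 2.0 * m2 * E1)


def cm_velocity(E1, m1, m2):
    """Velocity (11.4) of the C-system relative to the L-system."""
    p1 = math.sqrt(E1 * E1 - m1 * m1)
    return p1 / (E1 + m2)


def threshold_energy(m1, m2, product_masses):
    """Smallest projectile energy E1 for which M reaches the sum of the product masses.

    Obtained by setting M in (11.5) equal to sum(product_masses) and solving for E1.
    """
    total = sum(product_masses)
    return (total * total - m1 * m1 - m2 * m2) / (2.0 * m2)


if __name__ == "__main__":
    # Two particles of unit mass producing four particles of unit mass.
    E1 = threshold_energy(1.0, 1.0, [1.0, 1.0, 1.0, 1.0])
    print(E1, cm_energy(E1, 1.0, 1.0), cm_velocity(E1, 1.0, 1.0))
```

At threshold all products are at rest in the C-system, so they move together with the velocity (11.4) in the L-system.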


PROBLEMS 


1. A particle moving with velocity V dissociates “in flight” into two particles. Determine the relation 
between the angles of emergence of these particles and their energies. 

Solution: Let $\mathcal{E}_0$ be the energy of one of the decay particles in the C-system [i.e.
$\mathcal{E}_{10}$ or $\mathcal{E}_{20}$ in (11.3)], $\mathcal{E}$ the energy of this same particle in
the L-system, and θ its angle of emergence in the L-system (with respect to the direction of V).
By using the transformation formulas we find:

$$\mathcal{E}_0 = \frac{\mathcal{E} - Vp\cos\theta}{\sqrt{1 - V^2}},$$

so that

$$\cos\theta = \frac{\mathcal{E} - \mathcal{E}_0\sqrt{1 - V^2}}{V\sqrt{\mathcal{E}^2 - m^2}}. \tag{1}$$

For the determination of $\mathcal{E}$ from cos θ we then get the quadratic equation

$$\mathcal{E}^2(1 - V^2\cos^2\theta) - 2\mathcal{E}\mathcal{E}_0\sqrt{1 - V^2} + \mathcal{E}_0^2(1 - V^2) + V^2m^2\cos^2\theta = 0, \tag{2}$$

which has one positive root (if the velocity $v_0$ of the decay particle in the C-system satisfies
$v_0 > V$) or two positive roots (if $v_0 < V$).

The source of this ambiguity is clear from the following graphical construction. According to
(9.15), the momentum components in the L-system are expressed in terms of quantities referring to
the C-system by the formulas

$$p_x = \frac{p_0\cos\theta_0 + \mathcal{E}_0V}{\sqrt{1 - V^2}}, \qquad p_y = p_0\sin\theta_0.$$

Eliminating $\theta_0$, we get

$$p_y^2 + \left(p_x\sqrt{1 - V^2} - \mathcal{E}_0V\right)^2 = p_0^2.$$

With respect to the variables $p_x$, $p_y$, this is the equation of an ellipse with semiaxes
$p_0/\sqrt{1 - V^2}$, $p_0$, whose centre (the point O in Fig. 3) has been shifted a distance
$\mathcal{E}_0V/\sqrt{1 - V^2}$ from the point p = 0 (point A in Fig. 3).†

Fig. 3. (a) $V < v_0$; (b) $V > v_0$.

If $V > p_0/\mathcal{E}_0 = v_0$, the point A lies outside the ellipse (Fig. 3b), so that for a fixed
angle θ the vector p (and consequently the energy $\mathcal{E}$) can have two different values. It is
also clear from the construction that in this case the angle θ cannot exceed a definite value
$\theta_{\max}$ (corresponding to the position of the vector p in which it is tangent to the ellipse).
The value of $\theta_{\max}$ is most easily determined analytically from the condition that
the discriminant of the quadratic equation (2) go to zero:

$$\sin\theta_{\max} = \frac{p_0\sqrt{1 - V^2}}{mV}.$$
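Setting the discriminant of the quadratic equation (2) to zero gives $\sin\theta_{\max} = p_0\sqrt{1-V^2}/(mV)$, valid when $V > v_0$. The sketch below evaluates this angle and checks that the discriminant does vanish there. It assumes units with c = 1; the function names are hypothetical.

```python
import math


def max_emission_angle(E0, m, V):
    """Largest L-system emission angle (radians) of a decay particle, c = 1.

    E0 and m are the C-system energy and mass of the particle and V is the
    speed of the decaying parent. For V < v0 = p0 / E0 every angle is allowed
    and math.pi is returned.
    """
    p0 = math.sqrt(E0 * E0 - m * m)
    if V < p0 / E0:
        return math.pi
    # min() guards against rounding slightly above 1 when V is equal to v0.
    return math.asin(min(1.0, p0 * math.sqrt(1.0 - V * V) / (m * V)))


def quarter_discriminant(E0, m, V, theta):
    """One quarter of the discriminant of the quadratic equation (2) in the energy."""
    a = 1.0 - (V * math.cos(theta)) ** 2
    c0 = E0 * E0 * (1.0 - V * V) + (V * m * math.cos(theta)) ** 2
    return E0 * E0 * (1.0 - V * V) - a * c0


if __name__ == "__main__":
    E0, m, V = 1.25, 1.0, 0.8          # here v0 = 0.6 < V
    theta_max = max_emission_angle(E0, m, V)
    print(theta_max, quarter_discriminant(E0, m, V, theta_max))
```

For angles below the returned value the discriminant is positive and equation (2) has two real roots; above it there are none.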


2. Find the energy distribution of the decay particles in the L-system. 

Solution: In the C-system the decay particles are distributed isotropically in direction, i.e.
the number of particles within the element of solid angle $do_0 = 2\pi\sin\theta_0\,d\theta_0$ is

$$dN = \frac{1}{4\pi}\,do_0 = \frac{1}{2}\,d|\cos\theta_0|. \tag{1}$$

The energy in the L-system is given in terms of quantities referring to the C-system by

$$\mathcal{E} = \frac{\mathcal{E}_0 + p_0V\cos\theta_0}{\sqrt{1 - V^2}}$$

and runs through the range of values from

$$\frac{\mathcal{E}_0 - Vp_0}{\sqrt{1 - V^2}} \qquad \text{to} \qquad \frac{\mathcal{E}_0 + Vp_0}{\sqrt{1 - V^2}}.$$

† In the classical limit, the ellipse reduces to a circle. (See Mechanics, § 16.)

Expressing $d|\cos\theta_0|$ in terms of $d\mathcal{E}$, we obtain the normalized energy distribution
(for each of the two types of decay particles):

$$dN = \frac{1}{2Vp_0}\sqrt{1 - V^2}\,d\mathcal{E}.$$

3. Determine the range of values in the L-system for the angle between the two decay particles (their 
separation angle) for the case of decay into two identical particles. 

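Solution: In the C-system, the particles fly off in opposite directions, so that
$\theta_{10} = \pi - \theta_{20} \equiv \theta_0$. According to (5.4), the connection between angles in the
C- and L-systems is given by the formulas:

$$\cot\theta_1 = \frac{v_0\cos\theta_0 + V}{v_0\sin\theta_0\sqrt{1 - V^2}}, \qquad \cot\theta_2 = \frac{-v_0\cos\theta_0 + V}{v_0\sin\theta_0\sqrt{1 - V^2}}$$

(since $v_{10} = v_{20} = v_0$ in the present case). The required separation angle is
$\Theta = \theta_1 + \theta_2$, and a simple calculation gives:

$$\cot\Theta = \frac{V^2 - v_0^2 + V^2v_0^2\sin^2\theta_0}{2Vv_0\sqrt{1 - V^2}\,\sin\theta_0}.$$

An examination of the extrema of this expression gives the following ranges of possible values
of Θ:

$$\text{for } V < v_0: \quad 2\tan^{-1}\!\left(\frac{v_0}{V}\sqrt{1 - V^2}\right) < \Theta < \pi;$$

$$\text{for } v_0 < V < \frac{v_0}{\sqrt{1 - v_0^2}}: \quad 0 < \Theta < \sin^{-1}\sqrt{\frac{1 - V^2}{1 - v_0^2}} < \frac{\pi}{2};$$

$$\text{for } V > \frac{v_0}{\sqrt{1 - v_0^2}}: \quad 0 < \Theta < 2\tan^{-1}\!\left(\frac{v_0}{V}\sqrt{1 - V^2}\right) < \frac{\pi}{2}.$$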


4. Find the angular distribution in the L-system for decay particles of zero mass. 

Solution: According to (5.6) the connection between the angles of emergence in the C- and
L-systems for particles with m = 0 is

$$\cos\theta_0 = \frac{\cos\theta - V}{1 - V\cos\theta}.$$

Substituting this expression in formula (1) of Problem 2, we find:

$$dN = \frac{(1 - V^2)\,do}{4\pi(1 - V\cos\theta)^2}.$$

5. Find the distribution of separation angles in the L-system for a decay into two particles of zero mass. 
Solution: The relation between the angles of emergence $\theta_1$, $\theta_2$ in the L-system and the
angles $\theta_{10} \equiv \theta_0$, $\theta_{20} = \pi - \theta_0$ in the C-system is given by (5.6), so that
we have for the separation angle $\Theta = \theta_1 + \theta_2$:

$$\cos\Theta = \frac{2V^2 - 1 - V^2\cos^2\theta_0}{1 - V^2\cos^2\theta_0},$$

and conversely,

$$\cos\theta_0 = \sqrt{1 - \frac{1 - V^2}{V^2}\cot^2\frac{\Theta}{2}}.$$

Substituting this expression in formula (1) of Problem 2, we find:

$$dN = \frac{1 - V^2}{16\pi V}\,\frac{do}{\sin^3\frac{\Theta}{2}\sqrt{V^2 - \cos^2\frac{\Theta}{2}}}.$$

The angle Θ takes on values from π to $\Theta_{\min} = 2\cos^{-1}V$.

6. Determine the maximum energy which can be carried off by one of the decay particles, when a particle 
of mass M at rest decays into three particles with masses m t , m 2 , and m 3 . 

Solution: The particle $m_1$ has its maximum energy if the system of the other two particles
$m_2$ and $m_3$ has the least possible mass; the latter is equal to the sum $m_2 + m_3$ (and
corresponds to the case where the two particles move together with the same velocity). Having thus
reduced the problem to the decay of a body into two parts, we obtain from (11.3):

$$\mathcal{E}_{1\max} = \frac{M^2 + m_1^2 - (m_2 + m_3)^2}{2M}.$$


§ 12. Invariant cross-section 

Collision processes are characterized by their invariant cross-sections, which determine the 
number of collisions (of the particular type) occurring between beams of colliding particles. 

Suppose that we have two colliding beams; we denote by $n_1$ and $n_2$ the particle densities
in them (i.e. the numbers of particles per unit volume) and by $\mathbf{v}_1$ and $\mathbf{v}_2$ the
velocities of the particles. In the reference system in which particle 2 is at rest (or, as one
says, in the rest frame of particle 2), we are dealing with the collision of the beam of particles
1 with a stationary target. Then according to the usual definition of the cross-section σ, the
number of collisions occurring in volume dV in time dt is

$$d\nu = \sigma v_{\text{rel}}\,n_1n_2\,dV\,dt,$$

where $v_{\text{rel}}$ is the velocity of particle 1 in the rest system of particle 2 (which is just the
definition of the relative velocity of two particles in relativistic mechanics).

The number dν is by its very nature an invariant quantity. Let us try to express it in a form
which is applicable in any reference system:

$$d\nu = A\,n_1n_2\,dV\,dt, \tag{12.1}$$

where A is a number to be determined, for which we know that its value in the rest frame of
one of the particles is $v_{\text{rel}}\sigma$. We shall always mean by σ precisely the cross-section in
the rest frame of one of the particles, i.e. by definition, an invariant quantity. From its
definition, the relative velocity $v_{\text{rel}}$ is also invariant.

In the expression (12.1) the product dV dt is an invariant. Therefore the product $An_1n_2$ must
also be an invariant.

The law of transformation of the particle density n is easily found by noting that the
number of particles in a given volume element dV, n dV, is invariant. Writing
$n\,dV = n_0\,dV_0$ (the index 0 refers to the rest frame of the particles) and using formula (4.6)
for the transformation of the volume, we find:

$$n = \frac{n_0}{\sqrt{1 - v^2}} \tag{12.2}$$

or $n = n_0\mathcal{E}/m$, where $\mathcal{E}$ is the energy and m the mass of the particles.

Thus the statement that $An_1n_2$ is invariant is equivalent to the invariance of the expression
$A\mathcal{E}_1\mathcal{E}_2$. This condition is more conveniently represented in the form

$$A\,\frac{\mathcal{E}_1\mathcal{E}_2}{p_{1i}p_2^i} = A\,\frac{\mathcal{E}_1\mathcal{E}_2}{\mathcal{E}_1\mathcal{E}_2 - \mathbf{p}_1\cdot\mathbf{p}_2} = \text{inv}, \tag{12.3}$$

where the denominator is an invariant: the product of the four-momenta of the two particles.

In the rest frame of particle 2, we have $\mathcal{E}_2 = m_2$, $\mathbf{p}_2 = 0$, so that the invariant
quantity (12.3) reduces to A. On the other hand, in this frame $A = \sigma v_{\text{rel}}$. Thus in an
arbitrary reference system,

$$A = \sigma v_{\text{rel}}\,\frac{p_{1i}p_2^i}{\mathcal{E}_1\mathcal{E}_2}. \tag{12.4}$$

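To give this expression its final form, we express $v_{\text{rel}}$ in terms of the momenta or
velocities of the particles in an arbitrary reference frame. To do this we note that in the rest
frame of particle 2,

$$p_{1i}p_2^i = \frac{m_1}{\sqrt{1 - v_{\text{rel}}^2}}\,m_2.$$

Then

$$v_{\text{rel}} = \sqrt{1 - \frac{m_1^2m_2^2}{(p_{1i}p_2^i)^2}}. \tag{12.5}$$

Expressing the quantity $p_{1i}p_2^i = \mathcal{E}_1\mathcal{E}_2 - \mathbf{p}_1\cdot\mathbf{p}_2$ in terms of
the velocities $\mathbf{v}_1$ and $\mathbf{v}_2$ by using formulas (9.1) and (9.4):

$$p_{1i}p_2^i = m_1m_2\,\frac{1 - \mathbf{v}_1\cdot\mathbf{v}_2}{\sqrt{(1 - v_1^2)(1 - v_2^2)}},$$

and substituting in (12.5), after some simple transformations we get the following expression
for the relative velocity:

$$v_{\text{rel}} = \frac{\sqrt{(\mathbf{v}_1 - \mathbf{v}_2)^2 - (\mathbf{v}_1\times\mathbf{v}_2)^2}}{1 - \mathbf{v}_1\cdot\mathbf{v}_2} \tag{12.6}$$

(we note that this expression is symmetric in $\mathbf{v}_1$ and $\mathbf{v}_2$, i.e. the magnitude of
the relative velocity is independent of the choice of particle used in defining it).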
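The relative velocity can be computed either from the velocities, as $\sqrt{(\mathbf{v}_1-\mathbf{v}_2)^2-(\mathbf{v}_1\times\mathbf{v}_2)^2}/(1-\mathbf{v}_1\cdot\mathbf{v}_2)$, or from the invariant product of the four-momenta as in (12.5). The sketch below implements both routes so that they can be compared; it assumes units with c = 1 and three-component tuples for the velocities, and the function names are illustrative.

```python
import math


def relative_velocity(v1, v2):
    """Relative velocity (12.6) of two particles with 3-velocities v1 and v2 (c = 1)."""
    diff_sq = sum((a - b) ** 2 for a, b in zip(v1, v2))
    cross = (
        v1[1] * v2[2] - v1[2] * v2[1],
        v1[2] * v2[0] - v1[0] * v2[2],
        v1[0] * v2[1] - v1[1] * v2[0],
    )
    cross_sq = sum(x * x for x in cross)
    dot = sum(a * b for a, b in zip(v1, v2))
    # max() guards against a tiny negative value caused by rounding.
    return math.sqrt(max(0.0, diff_sq - cross_sq)) / (1.0 - dot)


def relative_velocity_from_invariant(v1, v2):
    """Same quantity via (12.5), with p_1i p_2^i / (m1 m2) written in terms of velocities."""
    g1 = 1.0 / math.sqrt(1.0 - sum(a * a for a in v1))
    g2 = 1.0 / math.sqrt(1.0 - sum(b * b for b in v2))
    dot = sum(a * b for a, b in zip(v1, v2))
    s = g1 * g2 * (1.0 - dot)  # invariant product divided by m1 m2
    return math.sqrt(1.0 - 1.0 / (s * s))


if __name__ == "__main__":
    v1, v2 = (0.6, 0.0, 0.0), (0.0, 0.8, 0.0)
    print(relative_velocity(v1, v2), relative_velocity_from_invariant(v1, v2))
    print(relative_velocity(v2, v1))  # symmetric in the two particles
```

For collinear velocities the cross product vanishes and the first function reduces to the one-dimensional relativistic composition of velocities.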

Substituting (12.5) or (12.6) in (12.4) and then in (12.1), we get the final formulas for
solving our problem:

$$d\nu = \sigma\,\frac{\sqrt{(p_{1i}p_2^i)^2 - m_1^2m_2^2}}{\mathcal{E}_1\mathcal{E}_2}\,n_1n_2\,dV\,dt \tag{12.7}$$

or

$$d\nu = \sigma\sqrt{(\mathbf{v}_1 - \mathbf{v}_2)^2 - (\mathbf{v}_1\times\mathbf{v}_2)^2}\;n_1n_2\,dV\,dt \tag{12.8}$$

(W. Pauli, 1933).

If the velocities $\mathbf{v}_1$ and $\mathbf{v}_2$ are collinear, then
$\mathbf{v}_1\times\mathbf{v}_2 = 0$, so that formula (12.8) takes the form:

$$d\nu = \sigma\,|\mathbf{v}_1 - \mathbf{v}_2|\,n_1n_2\,dV\,dt. \tag{12.9}$$

PROBLEM


Find the “element of length” in relativistic “velocity space”. 

Solution: The required line element $dl_v$ is the relative velocity of two points with velocities
v and v + dv. We therefore find from (12.6)

$$dl_v^2 = \frac{(d\mathbf{v})^2 - (\mathbf{v}\times d\mathbf{v})^2}{(1 - v^2)^2} = \frac{dv^2}{(1 - v^2)^2} + \frac{v^2}{1 - v^2}\left(d\theta^2 + \sin^2\theta\,d\phi^2\right),$$

where θ, φ are the polar angle and azimuth of the direction of v. If in place of v we introduce
the new variable χ through the equation $v = \tanh\chi$, the line element is expressed as:

$$dl_v^2 = d\chi^2 + \sinh^2\chi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right).$$

From the geometrical point of view this is the line element in three-dimensional Lobachevskii
space, the space of constant negative curvature (see (111.12)).


§ 13. Elastic collisions of particles 

Let us consider, from the point of view of relativistic mechanics, the elastic collision of 
particles. We denote the momenta and energies of the two colliding particles (with masses 
m \ and m 2 ) by p b and p 2 , we use primes for the corresponding quantities after 
collision. The laws of conservation of momentum and energy in the collision can be written 
together as the equation for conservation of the four-momentum: 

P\ +P‘ 2 =P\ +P2- (13.1) 

From this four-vector equation we construct invariant relations which will be helpful in 
further computations. To do this we rewrite (13.1) in the form: 

$$p_1^i + p_2^i - p_1'^i = p_2'^i,$$

and square both sides (i.e. we write the scalar product of each side with itself). Noting that
the squares of the four-momenta $p_1^i$ and $p_1'^i$ are equal to $m_1^2$, and the squares of $p_2^i$ and $p_2'^i$
are equal to $m_2^2$, we get:

$$m_1^2 + p_{1i}p_2^i - p_{1i}p_1'^i - p_{2i}p_1'^i = 0. \tag{13.2}$$

Similarly, squaring the equation $p_1^i + p_2^i - p_2'^i = p_1'^i$, we get:

$$m_2^2 + p_{1i}p_2^i - p_{2i}p_2'^i - p_{1i}p_2'^i = 0. \tag{13.3}$$

Let us consider the collision in a reference system (the L-system) in which one of the 
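particles ($m_2$) was at rest before the collision. Then $\mathbf{p}_2 = 0$, $\mathscr{E}_2 = m_2$, and the scalar products
appearing in (13.2) are:

$$\begin{aligned}
p_{1i}p_2^i &= \mathscr{E}_1 m_2, \\
p_{2i}p_1'^i &= m_2\mathscr{E}_1', \\
p_{1i}p_1'^i &= \mathscr{E}_1\mathscr{E}_1' - \mathbf{p}_1\cdot\mathbf{p}_1' = \mathscr{E}_1\mathscr{E}_1' - p_1 p_1'\cos\theta_1,
\end{aligned} \tag{13.4}$$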

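where $\theta_1$ is the angle of scattering of the incident particle $m_1$. Substituting these expressions
in (13.2) we get:

$$\cos\theta_1 = \frac{\mathscr{E}_1'(\mathscr{E}_1 + m_2) - \mathscr{E}_1 m_2 - m_1^2}{p_1 p_1'}. \tag{13.5}$$

Similarly, we find from (13.3):

$$\cos\theta_2 = \frac{(\mathscr{E}_1 + m_2)(\mathscr{E}_2' - m_2)}{p_1 p_2'}, \tag{13.6}$$

where $\theta_2$ is the angle between the transferred momentum $\mathbf{p}_2'$ and the momentum of the
incident particle $\mathbf{p}_1$.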

The formulas (13.5) and (13.6) relate the angles of scattering of the two particles in the L-system
to the changes in their energy in the collision. Inverting these formulas, we can
express the energies $\mathscr{E}_1'$, $\mathscr{E}_2'$ in terms of the angles $\theta_1$ or $\theta_2$. Thus, substituting in (13.6)
$p_1 = \sqrt{\mathscr{E}_1^2 - m_1^2}$, $p_2' = \sqrt{\mathscr{E}_2'^2 - m_2^2}$ and squaring both sides, we find after a simple
computation:

$$\mathscr{E}_2' = m_2\,\frac{(\mathscr{E}_1 + m_2)^2 + (\mathscr{E}_1^2 - m_1^2)\cos^2\theta_2}{(\mathscr{E}_1 + m_2)^2 - (\mathscr{E}_1^2 - m_1^2)\cos^2\theta_2}. \tag{13.7}$$

Inversion of formula (13.5) leads in the general case to a very complicated formula for
$\mathscr{E}_1'$ in terms of $\theta_1$.
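
Formula (13.7) can be tested against the conservation laws directly. The sketch below is an illustration only, with c = 1 and hypothetical function names: it computes the recoil energy from (13.7), reconstructs the momentum of the scattered particle from momentum conservation, and returns the resulting energy imbalance, which should vanish.

```python
import math

def recoil_energy(E1, m1, m2, theta2):
    """Energy of the target particle after the collision, formula (13.7), c = 1."""
    a = (E1 + m2) ** 2
    b = (E1 ** 2 - m1 ** 2) * math.cos(theta2) ** 2
    return m2 * (a + b) / (a - b)

def energy_imbalance(E1, m1, m2, theta2):
    """E1 + m2 - E1' - E2', with E1' rebuilt from momentum conservation."""
    p1 = math.sqrt(E1 ** 2 - m1 ** 2)
    E2p = recoil_energy(E1, m1, m2, theta2)
    p2p = math.sqrt(E2p ** 2 - m2 ** 2)
    # Incident momentum along x; the target recoils at angle theta2 to it.
    p1p_x = p1 - p2p * math.cos(theta2)
    p1p_y = -p2p * math.sin(theta2)
    E1p = math.sqrt(p1p_x ** 2 + p1p_y ** 2 + m1 ** 2)
    return E1 + m2 - E1p - E2p

# Illustrative numbers: incident energy 5, masses 1 and 2, recoil angle 0.4 rad.
imbalance = energy_imbalance(5.0, 1.0, 2.0, 0.4)
# At theta2 = pi/2 no energy is transferred and the target keeps its rest energy.
grazing = recoil_energy(5.0, 1.0, 2.0, math.pi / 2)
```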

We note that if $m_1 > m_2$, i.e. if the incident particle is heavier than the target particle, the
scattering angle $\theta_1$ cannot exceed a certain maximum value. It is easy to find by elementary
computations that this value is given by the equation

$$\sin\theta_{1\,\max} = \frac{m_2}{m_1}, \tag{13.8}$$

which coincides with the familiar classical result.

Formulas (13.5) and (13.6) simplify in the case when the incident particle has zero mass: $m_1 = 0$,
and correspondingly $p_1 = \mathscr{E}_1$, $p_1' = \mathscr{E}_1'$. For this case let us write the formula for the
energy of the incident particle after the collision, expressed in terms of its angle of deflection:

$$\mathscr{E}_1' = \frac{m_2}{1 - \cos\theta_1 + m_2/\mathscr{E}_1}. \tag{13.9}$$

Let us now turn once again to the general case of collision of particles of arbitrary mass. 
The collision is most simply treated in the C-system. Designating quantities in this system 
by the additional subscript 0, we have $\mathbf{p}_{10} = -\mathbf{p}_{20} \equiv \mathbf{p}_0$. From the conservation of momentum,
during the collision the momenta of the two particles merely rotate, remaining equal in 
magnitude and opposite in direction. From the conservation of energy, the value of each of 
the momenta remains unchanged. 

Let $\chi$ be the angle of scattering in the C-system, i.e. the angle through which the momenta
$\mathbf{p}_{10}$ and $\mathbf{p}_{20}$ are rotated by the collision. This quantity completely determines the scattering
process in the C-system, and therefore also in any other reference system. It is also convenient 
in describing the collision in the L-system and serves as the single parameter which remains 
undetermined after the conservation of momentum and energy are applied. 

We express the final energies of the two particles in the L-system in terms of this parameter. 
To do this we return to (13.2), but this time write out the product $p_{1i}p_1'^i$ in the C-system:

$$p_{1i}p_1'^i = \mathscr{E}_{10}\mathscr{E}_{10}' - \mathbf{p}_{10}\cdot\mathbf{p}_{10}' = \mathscr{E}_{10}^2 - p_0^2\cos\chi = p_0^2(1 - \cos\chi) + m_1^2$$

(in the C-system the energies of the particles do not change in the collision: $\mathscr{E}_{10}' = \mathscr{E}_{10}$). We
write out the other two products in the L-system, i.e. we use (13.4). As a result we get:

$$\mathscr{E}_1' - \mathscr{E}_1 = -\frac{p_0^2}{m_2}(1 - \cos\chi).$$

We must still express $p_0^2$ in terms of quantities referring to the L-system. This is easily done
by equating the values of the invariant $p_{1i}p_2^i$ in the L- and C-systems:

$$\mathscr{E}_{10}\mathscr{E}_{20} - \mathbf{p}_{10}\cdot\mathbf{p}_{20} = \mathscr{E}_1 m_2,$$

or

$$\sqrt{(p_0^2 + m_1^2)(p_0^2 + m_2^2)} = \mathscr{E}_1 m_2 - p_0^2.$$

Solving the equation for $p_0^2$, we get:

$$p_0^2 = \frac{m_2^2(\mathscr{E}_1^2 - m_1^2)}{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}. \tag{13.10}$$

Thus, we finally have:

$$\mathscr{E}_1' = \mathscr{E}_1 - \frac{m_2(\mathscr{E}_1^2 - m_1^2)}{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}\,(1 - \cos\chi). \tag{13.11}$$


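The energy of the second particle is obtained from the conservation law: $\mathscr{E}_1 + m_2 = \mathscr{E}_1' + \mathscr{E}_2'$.
Therefore

$$\mathscr{E}_2' = m_2 + \frac{m_2(\mathscr{E}_1^2 - m_1^2)}{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}\,(1 - \cos\chi). \tag{13.12}$$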


The second terms in these formulas represent the energy lost by the first particle and
transferred to the second particle. The maximum energy transfer occurs for $\chi = \pi$, and is
equal to

$$\mathscr{E}_{2\,\max}' - m_2 = \mathscr{E}_1 - \mathscr{E}_{1\,\min}' = \frac{2m_2(\mathscr{E}_1^2 - m_1^2)}{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}. \tag{13.13}$$

The ratio of the minimum kinetic energy of the incident particle after collision to its initial
kinetic energy is:

$$\frac{\mathscr{E}_{1\,\min}' - m_1}{\mathscr{E}_1 - m_1} = \frac{(m_1 - m_2)^2}{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}. \tag{13.14}$$


In the limiting case of low velocities (when $\mathscr{E} \approx m + mv^2/2$), this relation tends to a constant
limit, equal to

$$\left(\frac{m_1 - m_2}{m_1 + m_2}\right)^2.$$

In the opposite limit of large energies $\mathscr{E}_1$, relation (13.14) tends to zero; the quantity
$\mathscr{E}_{1\,\min}'$ tends to a constant limit. This limit is

$$\mathscr{E}_{1\,\min}' = \frac{m_1^2 + m_2^2}{2m_2}.$$
Let us assume that $m_2 \gg m_1$, i.e. the mass of the incident particle is small compared to the
mass of the particle at rest. According to classical mechanics the light particle could transfer
only a negligible part of its energy (see Mechanics, § 17). This is not the case in relativistic
mechanics. From formula (13.14) we see that for sufficiently large energies $\mathscr{E}_1$ the fraction
of the energy transferred can reach the order of unity. For this it is not sufficient that the
velocity of $m_1$ be of order 1, but one must have $\mathscr{E}_1 \sim m_2$, i.e. the light particle must have an
energy of the order of the rest energy of the heavy particle.

A similar situation occurs for $m_2 \ll m_1$, i.e. when a heavy particle is incident on a light
one. Here too, according to classical mechanics, the energy transfer would be insignificant.
The fraction of the energy transferred begins to be significant only for energies $\mathscr{E}_1 \sim m_1^2/m_2$.
We note that we are not talking simply of velocities of the order of the light velocity, but of
energies large compared to $m_1$, i.e. we are dealing with the ultrarelativistic case.
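
The statements above are easy to explore numerically. The following sketch (an illustration with c = 1; the function names and sample masses are assumptions, not part of the text) evaluates (13.10) to (13.12) and the largest fraction of the incident kinetic energy that can be transferred, which by (13.14) is one minus the ratio given there.

```python
import math

def p0_squared(E1, m1, m2):
    """Squared momentum in the C-system, formula (13.10)."""
    return m2 ** 2 * (E1 ** 2 - m1 ** 2) / (m1 ** 2 + m2 ** 2 + 2 * m2 * E1)

def final_energies(E1, m1, m2, chi):
    """L-system energies (E1', E2') after the collision, formulas (13.11) and (13.12)."""
    transfer = (m2 * (E1 ** 2 - m1 ** 2) / (m1 ** 2 + m2 ** 2 + 2 * m2 * E1)
                * (1 - math.cos(chi)))
    return E1 - transfer, m2 + transfer

def max_transfer_fraction(E1, m1, m2):
    """Largest transferable fraction of the kinetic energy E1 - m1 (chi = pi)."""
    return 1 - (m1 - m2) ** 2 / (m1 ** 2 + m2 ** 2 + 2 * m2 * E1)

# Consistency of (13.10) with the invariant p_{1i} p_2^i = E1 * m2.
E1, m1, m2 = 5.0, 1.0, 2.0
p0sq = p0_squared(E1, m1, m2)
invariant = math.sqrt((p0sq + m1 ** 2) * (p0sq + m2 ** 2)) + p0sq

# Light particle on a heavy one (m2 = 1000 m1): a merely relativistic energy
# transfers little, an energy of the order of m2 transfers a sizeable fraction.
small = max_transfer_fraction(2.0, 1.0, 1000.0)
large = max_transfer_fraction(1000.0, 1.0, 1000.0)
```

In this sketch `small` comes out well below one per cent while `large` is of order unity, in line with the condition $\mathscr{E}_1 \sim m_2$ stated in the text.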


PROBLEMS 

1. The triangle ABC in Fig. 4 is formed by the momentum vector $\mathbf{p}_1$ of the impinging particle and the
momenta $\mathbf{p}_1'$, $\mathbf{p}_2'$ of the two particles after the collision. Find the locus of the points C corresponding to all
possible values of $\mathbf{p}_1'$, $\mathbf{p}_2'$.

Solution: The required curve is an ellipse whose semiaxes can be found by using the formulas obtained
in problem 1 of § 11. In fact, the construction given there determined the locus of the vectors $\mathbf{p}$ in the
L-system which are obtained from arbitrarily directed vectors $\mathbf{p}_0$ with given length $p_0$ in the C-system.



Fig. 4. 


Since the absolute values of the momenta of the colliding particles are identical in the C-system, and do
not change in the collision, we are dealing with a similar construction for the vector $\mathbf{p}_1'$, for which

$$p_0 \equiv p_{10} = p_{20} = \frac{m_2 V}{\sqrt{1 - V^2}}$$

in the C-system, where $V$ is the velocity of particle $m_2$ in the C-system; it coincides in magnitude with the
velocity of the centre of inertia, and is equal to $V = p_1/(\mathscr{E}_1 + m_2)$ (see (11.4)). As a result we find that the
minor and major semiaxes of the ellipse are

$$p_0 = \frac{m_2 p_1}{\sqrt{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}}, \qquad \frac{p_0}{\sqrt{1 - V^2}} = \frac{m_2 p_1(\mathscr{E}_1 + m_2)}{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}$$

(the first of these is, of course, the same as (13.10)).

For $\theta_1 = 0$, the vector $\mathbf{p}_1'$ coincides with $\mathbf{p}_1$, so that the distance AB is equal to $p_1$. Comparing $p_1$ with the
length of the major axis of the ellipse, it is easily shown that the point A lies outside the ellipse if $m_1 > m_2$
(Fig. 4a), and inside it if $m_1 < m_2$ (Fig. 4b).

2. Determine the minimum separation angle $\Theta_{\min}$ of two particles after collision if the masses of the two
particles are the same ($m_1 = m_2 \equiv m$).
Solution: If $m_1 = m_2$, the point A of the diagram lies on the ellipse, while the minimum separation angle
corresponds to the situation where point C is at the end of the minor axis (Fig. 5). From the construction
it is clear that $\tan(\Theta_{\min}/2)$ is the ratio of the lengths of the semiaxes, and we find:

$$\tan\frac{\Theta_{\min}}{2} = \sqrt{\frac{2m}{\mathscr{E}_1 + m}},$$

or

$$\cos\Theta_{\min} = \frac{\mathscr{E}_1 - m}{\mathscr{E}_1 + 3m}.$$

Fig. 5.



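3. For the collision of two particles of equal mass $m$, express $\mathscr{E}_1'$, $\mathscr{E}_2'$, $\chi$ in terms of the angle $\theta_1$ of
scattering in the L-system.

Solution: Inversion of formula (13.5) in this case gives:

$$\mathscr{E}_1' = m\,\frac{(\mathscr{E}_1 + m) + (\mathscr{E}_1 - m)\cos^2\theta_1}{(\mathscr{E}_1 + m) - (\mathscr{E}_1 - m)\cos^2\theta_1}, \qquad \mathscr{E}_2' = m + \frac{(\mathscr{E}_1^2 - m^2)\sin^2\theta_1}{2m + (\mathscr{E}_1 - m)\sin^2\theta_1}.$$

Comparing with the expression for $\mathscr{E}_1'$ in terms of $\chi$:

$$\mathscr{E}_1' = \mathscr{E}_1 - \frac{\mathscr{E}_1 - m}{2}\,(1 - \cos\chi),$$

we find the angle of scattering in the C-system:

$$\cos\chi = \frac{2m - (\mathscr{E}_1 + 3m)\sin^2\theta_1}{2m + (\mathscr{E}_1 - m)\sin^2\theta_1}.$$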


§ 14. Angular momentum 

As is well known from classical mechanics, for a closed system, in addition to conservation 
of energy and momentum, there is conservation of angular momentum, that is, of the vector 

$$\mathbf{M} = \sum \mathbf{r}\times\mathbf{p},$$

where r and p are the radius vector and momentum of the particle; the summation runs over 
all the particles making up the system. The conservation of angular momentum is a consequence 
of the fact that because of the isotropy of space, the Lagrangian of a closed system does not 
change under a rotation of the system as a whole. 

By carrying through a similar derivation in four-dimensional form, we obtain the relativistic 
expression for the angular momentum. Let $x^i$ be the coordinates of one of the particles of the
system. We make an infinitesimal rotation in the four-dimensional space. Under such a
transformation, the coordinates $x^i$ take on new values $x'^i$ such that the differences $x'^i - x^i$ are
linear functions

$$x'^i - x^i = x_k\,\delta\Omega^{ik} \tag{14.1}$$

with infinitesimal coefficients $\delta\Omega_{ik}$. The components of the four-tensor $\delta\Omega_{ik}$ are connected
to one another by the relations resulting from the requirement that, under a rotation, the
length of the radius vector must remain unchanged, that is, $x_i'x'^i = x_i x^i$. Substituting for $x'^i$
from (14.1) and dropping terms quadratic in $\delta\Omega_{ik}$, as infinitesimals of higher order, we find

$$x^i x^k\,\delta\Omega_{ik} = 0.$$

This equation must be fulfilled for arbitrary $x^i$. Since $x^i x^k$ is a symmetric tensor, $\delta\Omega_{ik}$ must
be an antisymmetric tensor (the product of a symmetrical and an antisymmetrical tensor is
clearly identically zero). Thus we find that

$$\delta\Omega_{ki} = -\delta\Omega_{ik}. \tag{14.2}$$

The change in the action for an infinitesimal change of coordinates of the initial point a 
and the final point $b$ of the trajectory has the form (see (9.11)):

$$\delta S = -\sum p^i\,\delta x_i\,\Big|_a^b$$

(the summation extends over all the particles of the system). In the case of rotation which
we are now considering, $\delta x_i = \delta\Omega_{ik}x^k$, and so

$$\delta S = -\delta\Omega_{ik}\sum p^i x^k\,\Big|_a^b.$$

If we resolve the tensor $\sum p^i x^k$ into symmetric and antisymmetric parts, then the first of
these when multiplied by an antisymmetric tensor gives identically zero. Therefore, taking
the antisymmetric part of $\sum p^i x^k$, we can write the preceding equality in the form

$$\delta S = -\delta\Omega_{ik}\,\frac{1}{2}\sum (p^i x^k - p^k x^i)\,\Big|_a^b. \tag{14.3}$$

For a closed system the action, being an invariant, is not changed by a rotation in 4-space.
This means that the coefficients of $\delta\Omega_{ik}$ in (14.3) must vanish:

$$\Bigl(\sum (p^i x^k - p^k x^i)\Bigr)_b = \Bigl(\sum (p^i x^k - p^k x^i)\Bigr)_a.$$

Consequently we see that for a closed system the tensor

$$M^{ik} = \sum (x^i p^k - x^k p^i) \tag{14.4}$$

is conserved.

This antisymmetric tensor is called the four-tensor of angular momentum. The space components
of this tensor are the components of the three-dimensional angular momentum vector
$\mathbf{M} = \sum \mathbf{r}\times\mathbf{p}$:

$$M^{23} = M_x, \qquad -M^{13} = M_y, \qquad M^{12} = M_z.$$

The components $M^{01}$, $M^{02}$, $M^{03}$ form a vector $c\sum(t\mathbf{p} - \mathscr{E}\mathbf{r}/c^2)$. Thus, we can write the
components of the tensor $M^{ik}$ in the form:

$$M^{ik} = \left(c\sum\left(t\mathbf{p} - \frac{\mathscr{E}\mathbf{r}}{c^2}\right),\; -\mathbf{M}\right). \tag{14.5}$$

(Compare (6.10).)

Because of the conservation of $M^{ik}$ for a closed system, we have, in particular,

$$\sum\left(t\mathbf{p} - \frac{\mathscr{E}\mathbf{r}}{c^2}\right) = \text{const}.$$

Since, on the other hand, the total energy $\sum\mathscr{E}$ is also conserved, this equality can be written
in the form

$$\frac{\sum\mathscr{E}\mathbf{r}}{\sum\mathscr{E}} - t\,\frac{c^2\sum\mathbf{p}}{\sum\mathscr{E}} = \text{const}.$$

(Quantities referring to different particles are taken at the same time $t$.)
From this we see that the point with the radius vector

$$\mathbf{R} = \frac{\sum\mathscr{E}\mathbf{r}}{\sum\mathscr{E}} \tag{14.6}$$

moves uniformly with the velocity

$$\mathbf{V} = \frac{c^2\sum\mathbf{p}}{\sum\mathscr{E}}, \tag{14.7}$$

which is none other than the velocity of motion of the system as a whole. [It relates the total
energy and momentum, according to formula (9.8).] Formula (14.6) gives the relativistic
definition of the coordinates of the centre of inertia of the system. If the velocities of all the
particles are small compared to $c$, we can approximately set $\mathscr{E} \approx mc^2$ so that (14.6) goes over
into the usual classical expression

$$\mathbf{R} = \frac{\sum m\mathbf{r}}{\sum m}.\,†$$


We note that the components of the vector (14.6) do not constitute the space components 
of any four-vector, and therefore under a transformation of reference frame they do not 
transform like the coordinates of a point. Thus we get different points for the centre of 
inertia of a given system with respect to different reference frames. 
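
As an illustration of (14.6) and (14.7), the sketch below evaluates the centre of inertia of a system of free (non-interacting) particles at two times and compares its displacement with $\mathbf{V}t$. It is a minimal example under stated assumptions: units with c = 1, straight-line motion $\mathbf{r}(t) = \mathbf{r}_0 + \mathbf{v}t$, and illustrative names and numbers throughout.

```python
import math

C = 1.0  # speed of light; c = 1 is an assumption of this sketch

def energy(m, v):
    """Energy of a free particle of mass m and velocity v."""
    v2 = sum(x * x for x in v)
    return m * C ** 2 / math.sqrt(1 - v2 / C ** 2)

def centre_of_inertia(particles, t):
    """R = sum(E r) / sum(E), formula (14.6), with r(t) = r0 + v t."""
    total = sum(energy(m, v) for m, r0, v in particles)
    return [sum(energy(m, v) * (r0[k] + v[k] * t) for m, r0, v in particles) / total
            for k in range(3)]

def system_velocity(particles):
    """V = c^2 sum(p) / sum(E), formula (14.7), using p = E v / c^2."""
    total = sum(energy(m, v) for m, r0, v in particles)
    return [C ** 2 * sum(energy(m, v) * v[k] / C ** 2 for m, r0, v in particles) / total
            for k in range(3)]

# Each particle: (mass, initial position, velocity); numbers are arbitrary.
particles = [
    (1.0, (0.0, 0.0, 0.0), (0.8, 0.0, 0.0)),
    (3.0, (1.0, 2.0, 0.0), (0.0, -0.5, 0.1)),
]

R0 = centre_of_inertia(particles, 0.0)
R2 = centre_of_inertia(particles, 2.0)
V = system_velocity(particles)
```

Note that the weights are the energies, not the masses, so a fast light particle can outweigh a slow heavy one.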


PROBLEM 

Find the connection between the angular momentum $\mathbf{M}$ of a body (system of particles) in the reference
frame $K$ in which the body moves with velocity $\mathbf{V}$, and its angular momentum $\mathbf{M}^{(0)}$ in the frame $K_0$ in which
the body is at rest as a whole; in both cases the angular momentum is defined with respect to the same
point, the centre of inertia of the body in the system $K_0$.‡


† We note that whereas the classical formula for the centre of inertia applies equally well to interacting
and non-interacting particles, formula (14.6) is valid only if we neglect interaction. In relativistic mechanics, 
the definition of the centre of inertia of a system of interacting particles requires us to include explicitly the 
momentum and energy of the field produced by the particles. 

‡ We remind the reader that although in the system $K_0$ (in which $\sum\mathbf{p} = 0$) the angular momentum is
independent of the choice of the point with respect to which it is defined, in the $K$ system (in which
$\sum\mathbf{p} \neq 0$) the angular momentum does depend on this choice (see Mechanics, § 9).





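Solution: The $K_0$ system moves relative to the $K$ system with velocity $V$; we choose its direction for the
$x$ axis. The components of $M^{ik}$ that we want transform according to the formulas (see problem 2 in § 6):

$$M^{12} = \frac{M^{(0)12} + \dfrac{V}{c}M^{(0)02}}{\sqrt{1 - V^2/c^2}}, \qquad M^{13} = \frac{M^{(0)13} + \dfrac{V}{c}M^{(0)03}}{\sqrt{1 - V^2/c^2}}, \qquad M^{23} = M^{(0)23}.$$

Since the origin of coordinates was chosen at the centre of inertia of the body (in the $K_0$ system), in that
system $\sum\mathscr{E}\mathbf{r} = 0$, and since in that system $\sum\mathbf{p} = 0$, $M^{(0)02} = M^{(0)03} = 0$. Using the connection between the
components of $M^{ik}$ and the vector $\mathbf{M}$, we find for the latter:

$$M_x = M_x^{(0)}, \qquad M_y = \frac{M_y^{(0)}}{\sqrt{1 - V^2/c^2}}, \qquad M_z = \frac{M_z^{(0)}}{\sqrt{1 - V^2/c^2}}.$$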


CHAPTER 3 


CHARGES IN ELECTROMAGNETIC FIELDS 


§ 15. Elementary particles in the theory of relativity 

The interaction of particles can be described with the help of the concept of a field of 
force. Namely, instead of saying that one particle acts on another, we may say that the 
particle creates a field around itself; a certain force then acts on every other particle located 
in this field. In classical mechanics, the field is merely a mode of description of the physical 
phenomenon—the interaction of particles. In the theory of relativity, because of the finite 
velocity of propagation of interactions, the situation is changed fundamentally. The forces 
acting on a particle at a given moment are not determined by the positions at that same 
moment. A change in the position of one of the particles influences other particles only after 
the lapse of a certain time interval. This means that the field itself acquires physical reality. 
We cannot speak of a direct interaction of particles located at a distance from one another. 
Interactions can occur at any one moment only between neighbouring points in space (contact 
interaction). Therefore we must speak of the interaction of the one particle with the field, 
and of the subsequent interaction of the field with the second particle. 

We shall consider two types of fields, gravitational and electromagnetic. The study of 
gravitational fields is left to Chapters 10 to 14 and in the other chapters we consider only 
electromagnetic fields. 

Before considering the interactions of particles with the electromagnetic field, we shall 
make some remarks concerning the concept of a “particle” in relativistic mechanics. 

In classical mechanics one can introduce the concept of a rigid body, i.e., a body which is 
not deformable under any conditions. In the theory of relativity it should follow similarly 
that we would consider as rigid those bodies whose dimensions all remain unchanged in the 
reference system in which they are at rest. However, it is easy to see that the theory of 
relativity makes the existence of rigid bodies impossible in general. 

Consider, for example, a circular disk rotating around its axis, and let us assume that it is 
rigid. A reference frame fixed in the disk is clearly not inertial. It is possible, however, to 
introduce for each of the infinitesimal elements of the disk an inertial system in which this 
element would be at rest at the moment; for different elements of the disk, having different 
velocities, these systems will, of course, also be different. Let us consider a series of line 
elements, lying along a particular radius vector. Because of the rigidity of the disk, the 
length of each of these segments (in the corresponding inertial system of reference) will be 
the same as it was when the disk was at rest. This same length would be measured by an 
observer at rest, past whom this radius swings at the given moment, since each of its 
segments is perpendicular to its velocity and consequently a Lorentz contraction does not 
occur. Therefore the total length of the radius as measured by the observer at rest, being the 
sum of its segments, will be the same as when the disk was at rest. On the other hand, the
length of each element of the circumference of the disk, passing by the observer at rest at a 
given moment, undergoes a Lorentz contraction, so that the length of the whole circumference 
(measured by the observer at rest as the sum of the lengths of its various segments) turns out 
to be smaller than the length of the circumference of the disk at rest. Thus we arrive at the 
result that due to the rotation of the disk, the ratio of circumference to radius (as measured 
by an observer at rest) must change, and not remain equal to $2\pi$. The absurdity of this result
shows that actually the disk cannot be rigid, and that in rotation it must necessarily undergo 
some complex deformation depending on the elastic properties of the material of the disk. 

The impossibility of the existence of rigid bodies can be demonstrated in another way. 
Suppose some solid body is set in motion by an external force acting at one of its points. If 
the body were rigid, all of its points would have to be set in motion at the same time as the 
point to which the force is applied; if this were not so the body would be deformed. 
However, the theory of relativity makes this impossible, since the force at the particular 
point is transmitted to the others with a finite velocity, so that all the points cannot begin 
moving simultaneously. 

From this discussion we can draw certain conclusions concerning the treatment of 
“elementary” particles, i.e. particles whose state we assume to be described completely by
giving their three coordinates and the three components of their velocity as a whole. It is obvious
that if an elementary particle had finite dimensions, i.e. if it were extended in space, it could 
not be deformable, since the concept of deformability is related to the possibility of independent 
motion of individual parts of the body. But, as we have seen, the theory of relativity shows 
that it is impossible for absolutely rigid bodies to exist. 

Thus we come to the conclusion that in classical (non-quantum) relativistic mechanics, we 
cannot ascribe finite dimensions to particles which we regard as elementary. In other words, 
within the framework of classical theory elementary particles must be treated as points.†


§ 16. Four-potential of a field 

For a particle moving in a given electromagnetic field, the action is made up of two parts: 
the action (8.1) for the free particle, and a term describing the interaction of the particle with 
the field. The latter term must contain quantities characterizing the particle and quantities 
characterizing the field. 

It turns out‡ that the properties of a particle with respect to interaction with the electromagnetic
field are determined by a single parameter, the charge $e$ of the particle, which can
be either positive or negative (or equal to zero). The properties of the field are characterized
by a four-vector $A_i$, the four-potential, whose components are functions of the coordinates
and time. These quantities appear in the action function in the term

$$-\frac{e}{c}\int_a^b A_i\,dx^i,$$


† Quantum mechanics makes a fundamental change in this situation, but here again relativity theory
makes it extremely difficult to introduce anything other than point interactions. 

‡ The assertions which follow should be regarded as being, to a certain extent, the consequence of
experimental data. The form of the action for a particle in an electromagnetic field cannot be fixed on the
basis of general considerations alone (such as, for example, the requirement of relativistic invariance). The
latter would permit the occurrence in formula (16.1) of terms of the form $\int A\,ds$, where $A$ is a scalar function.

To avoid any misunderstanding, we repeat that we are considering classical (and not quantum) theory, 
and therefore do not include effects which are related to the spins of particles. 
where the functions $A_i$ are taken at points on the world line of the particle. The factor $1/c$ has
been introduced for convenience. It should be pointed out that, so long as we have no
formulas relating the charge or the potentials with already known quantities, the units for
measuring these new quantities can be chosen arbitrarily.†

Thus the action function for a charge in an electromagnetic field has the form 

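$$S = \int_a^b \left(-mc\,ds - \frac{e}{c}A_i\,dx^i\right). \tag{16.1}$$

The three space components of the four-vector $A^i$ form a three-dimensional vector $\mathbf{A}$
called the vector potential of the field. The time component is called the scalar potential; we
denote it by $A^0 = \phi$. Thus

$$A^i = (\phi, \mathbf{A}). \tag{16.2}$$

Therefore the action integral can be written in the form

$$S = \int_a^b \left(-mc\,ds + \frac{e}{c}\mathbf{A}\cdot d\mathbf{r} - e\phi\,dt\right).$$

Introducing $d\mathbf{r}/dt = \mathbf{v}$, and changing to an integration over $t$,

$$S = \int_{t_1}^{t_2} \left(-mc^2\sqrt{1 - \frac{v^2}{c^2}} + \frac{e}{c}\mathbf{A}\cdot\mathbf{v} - e\phi\right) dt. \tag{16.3}$$

The integrand is just the Lagrangian for a charge in an electromagnetic field:

$$L = -mc^2\sqrt{1 - \frac{v^2}{c^2}} + \frac{e}{c}\mathbf{A}\cdot\mathbf{v} - e\phi. \tag{16.4}$$

This function differs from the Lagrangian for a free particle (8.2) by the terms $(e/c)\,\mathbf{A}\cdot\mathbf{v} - e\phi$,
which describe the interaction of the charge with the field.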

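The derivative $\partial L/\partial\mathbf{v}$ is the generalized momentum of the particle; we denote it by $\mathbf{P}$.
Carrying out the differentiation, we find

$$\mathbf{P} = \frac{m\mathbf{v}}{\sqrt{1 - v^2/c^2}} + \frac{e}{c}\mathbf{A} = \mathbf{p} + \frac{e}{c}\mathbf{A}. \tag{16.5}$$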


Here we have denoted by p the ordinary momentum of the particle, which we shall refer to 
simply as its momentum. 

From the Lagrangian we can find the Hamiltonian function for a particle in a field from 
the general formula 


† Concerning the establishment of these units, see § 27.
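$$\mathscr{H} = \mathbf{v}\cdot\frac{\partial L}{\partial\mathbf{v}} - L.$$

Substituting (16.4), we get

$$\mathscr{H} = \frac{mc^2}{\sqrt{1 - v^2/c^2}} + e\phi. \tag{16.6}$$

However, the Hamiltonian must be expressed not in terms of the velocity, but rather in terms
of the generalized momentum of the particle.

From (16.5) and (16.6) it is clear that the relation between $\mathscr{H} - e\phi$ and $\mathbf{P} - (e/c)\mathbf{A}$ is the
same as the relation between $\mathscr{H}$ and $\mathbf{p}$ in the absence of the field, i.e.

$$\left(\frac{\mathscr{H} - e\phi}{c}\right)^2 = m^2c^2 + \left(\mathbf{P} - \frac{e}{c}\mathbf{A}\right)^2, \tag{16.7}$$

or else

$$\mathscr{H} = \sqrt{m^2c^4 + c^2\left(\mathbf{P} - \frac{e}{c}\mathbf{A}\right)^2} + e\phi. \tag{16.8}$$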


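For low velocities, i.e. for classical mechanics, the Lagrangian (16.4) goes over into

$$L = \frac{mv^2}{2} + \frac{e}{c}\mathbf{A}\cdot\mathbf{v} - e\phi. \tag{16.9}$$

In this approximation

$$\mathbf{p} = m\mathbf{v} = \mathbf{P} - \frac{e}{c}\mathbf{A},$$

and we find the following expression for the Hamiltonian:

$$\mathscr{H} = \frac{1}{2m}\left(\mathbf{P} - \frac{e}{c}\mathbf{A}\right)^2 + e\phi. \tag{16.10}$$

Finally we write the Hamilton-Jacobi equation for a particle in an electromagnetic field.
It is obtained by replacing, in the equation for the Hamiltonian, $\mathbf{P}$ by $\partial S/\partial\mathbf{r}$, and $\mathscr{H}$
by $-(\partial S/\partial t)$. Thus we get from (16.7)

$$\left(\operatorname{grad} S - \frac{e}{c}\mathbf{A}\right)^2 - \frac{1}{c^2}\left(\frac{\partial S}{\partial t} + e\phi\right)^2 + m^2c^2 = 0. \tag{16.11}$$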
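
The chain from the Lagrangian (16.4) to the Hamiltonian (16.8) can be verified numerically at a single point. The sketch below is illustrative only: the function names and the sample values of the potentials are assumptions, and the field is simply represented by its values $\mathbf{A}$, $\phi$ at the position of the charge.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lagrangian(m, e, c, v, A, phi):
    """Lagrangian of a charge in a field, formula (16.4)."""
    return -m * c ** 2 * math.sqrt(1 - dot(v, v) / c ** 2) + (e / c) * dot(A, v) - e * phi

def generalized_momentum(m, e, c, v, A):
    """P = p + (e/c) A, formula (16.5)."""
    gamma = 1 / math.sqrt(1 - dot(v, v) / c ** 2)
    return [m * gamma * vk + (e / c) * Ak for vk, Ak in zip(v, A)]

def hamiltonian(m, e, c, P, A, phi):
    """Hamiltonian expressed through the generalized momentum, formula (16.8)."""
    kinetic = [Pk - (e / c) * Ak for Pk, Ak in zip(P, A)]
    return math.sqrt(m ** 2 * c ** 4 + c ** 2 * dot(kinetic, kinetic)) + e * phi

# Arbitrary sample values (hypothetical units).
m, e, c = 2.0, 1.5, 3.0
v = [1.0, -0.5, 2.0]            # |v| < c
A, phi = [0.2, 0.0, -0.4], 0.7

P = generalized_momentum(m, e, c, v, A)
legendre = dot(v, P) - lagrangian(m, e, c, v, A, phi)   # v . dL/dv - L
direct = hamiltonian(m, e, c, P, A, phi)
```

Both `legendre` and `direct` should equal $mc^2/\sqrt{1 - v^2/c^2} + e\phi$, in agreement with (16.6).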


§ 17. Equations of motion of a charge in a field 

A charge located in a field not only is subjected to a force exerted by the field, but also in 
turn acts on the field, changing it. However, if the charge e is not large, the action of the 
charge on the field can be neglected. In this case, when considering the motion of the charge 
in a given field, we may assume that the field itself does not depend on the coordinates or 
the velocity of the charge. The precise conditions which the charge must fulfil in order to be 
considered as small in the present sense, will be clarified later on (see § 75). In what follows
we shall assume that this condition is fulfilled. 

So we must find the equations of motion of a charge in a given electromagnetic field. 
These equations are obtained by varying the action, i.e. they are given by the Lagrange 
equations 


$$\frac{d}{dt}\left(\frac{\partial L}{\partial\mathbf{v}}\right) = \frac{\partial L}{\partial\mathbf{r}}, \tag{17.1}$$


where L is given by formula (16.4). 

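The derivative $\partial L/\partial\mathbf{v}$ is the generalized momentum of the particle (16.5). Further, we
write

$$\frac{\partial L}{\partial\mathbf{r}} \equiv \nabla L = \frac{e}{c}\operatorname{grad}(\mathbf{A}\cdot\mathbf{v}) - e\operatorname{grad}\phi.$$

But from a formula of vector analysis,

$$\operatorname{grad}(\mathbf{a}\cdot\mathbf{b}) = (\mathbf{a}\cdot\nabla)\mathbf{b} + (\mathbf{b}\cdot\nabla)\mathbf{a} + \mathbf{b}\times\operatorname{curl}\mathbf{a} + \mathbf{a}\times\operatorname{curl}\mathbf{b},$$

where $\mathbf{a}$ and $\mathbf{b}$ are two arbitrary vectors. Applying this formula to $\mathbf{A}\cdot\mathbf{v}$, and remembering
that differentiation with respect to $\mathbf{r}$ is carried out for constant $\mathbf{v}$, we find

$$\frac{\partial L}{\partial\mathbf{r}} = \frac{e}{c}(\mathbf{v}\cdot\nabla)\mathbf{A} + \frac{e}{c}\,\mathbf{v}\times\operatorname{curl}\mathbf{A} - e\operatorname{grad}\phi.$$

So the Lagrange equation has the form:

$$\frac{d}{dt}\left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right) = \frac{e}{c}(\mathbf{v}\cdot\nabla)\mathbf{A} + \frac{e}{c}\,\mathbf{v}\times\operatorname{curl}\mathbf{A} - e\operatorname{grad}\phi.$$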

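But the total differential $(d\mathbf{A}/dt)\,dt$ consists of two parts: the change $(\partial\mathbf{A}/\partial t)\,dt$ of the vector
potential with time at a fixed point in space, and the change due to motion from one point
in space to another at distance $d\mathbf{r}$. This second part is equal to $(d\mathbf{r}\cdot\nabla)\mathbf{A}$. Thus

$$\frac{d\mathbf{A}}{dt} = \frac{\partial\mathbf{A}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{A}.$$

Substituting this in the previous equation, we find

$$\frac{d\mathbf{p}}{dt} = -\frac{e}{c}\frac{\partial\mathbf{A}}{\partial t} - e\operatorname{grad}\phi + \frac{e}{c}\,\mathbf{v}\times\operatorname{curl}\mathbf{A}. \tag{17.2}$$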


This is the equation of motion of a particle in an electromagnetic field. On the left side 
stands the derivative of the particle’s momentum with respect to the time. Therefore the 
expression on the right of (17.2) is the force exerted on the charge in an electromagnetic 
field. We see that this force consists of two parts. The first part (first and second terms on 
the right side of 17.2) does not depend on the velocity of the particle. The second part (third 
term) depends on the velocity, being proportional to the velocity and perpendicular to it. 

The force of the first type, per unit charge, is called the electric field intensity; we denote
it by $\mathbf{E}$. So by definition,

$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \operatorname{grad}\phi. \tag{17.3}$$

The factor of $\mathbf{v}/c$ in the force of the second type, per unit charge, is called the magnetic
field intensity. We designate it by $\mathbf{H}$. So by definition,

$$\mathbf{H} = \operatorname{curl}\mathbf{A}. \tag{17.4}$$

If in an electromagnetic field, $\mathbf{E} \neq 0$ but $\mathbf{H} = 0$, then we speak of an electric field; if $\mathbf{E} = 0$
but $\mathbf{H} \neq 0$, then the field is said to be magnetic. In general, the electromagnetic field is a
superposition of electric and magnetic fields. 

We note that E is a polar vector while H is an axial vector. 

The equation of motion of a charge in an electromagnetic field can now be written as 

$$\frac{d\mathbf{p}}{dt} = e\mathbf{E} + \frac{e}{c}\,\mathbf{v}\times\mathbf{H}. \tag{17.5}$$

The expression on the right is called the Lorentz force. The first term (the force which the 
electric field exerts on the charge) does not depend on the velocity of the charge, and is 
along the direction of E. The second part (the force exerted by the magnetic field on the 
charge) is proportional to the velocity of the charge and is directed perpendicular to the 
velocity and to the magnetic field H. 

For velocities small compared with the velocity of light, the momentum p is approximately 
equal to its classical expression mv, and the equation of motion (17.5) becomes 

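$$m\frac{d\mathbf{v}}{dt} = e\mathbf{E} + \frac{e}{c}\,\mathbf{v}\times\mathbf{H}. \tag{17.6}$$

Next we derive the equation for the rate of change of the kinetic energy of the particle†
with time, i.e. the derivative

$$\frac{d\mathscr{E}_{\rm kin}}{dt} = \frac{d}{dt}\left(\frac{mc^2}{\sqrt{1 - v^2/c^2}}\right).$$

It is easy to check that

$$\frac{d\mathscr{E}_{\rm kin}}{dt} = \mathbf{v}\cdot\frac{d\mathbf{p}}{dt}.$$

Substituting $d\mathbf{p}/dt$ from (17.5) and noting that $\mathbf{v}\times\mathbf{H}\cdot\mathbf{v} = 0$, we have

$$\frac{d\mathscr{E}_{\rm kin}}{dt} = e\mathbf{E}\cdot\mathbf{v}. \tag{17.7}$$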

The rate of change of the kinetic energy is the work done by the field on the particle per 
unit time. From (17.7) we see that this work is equal to the product of the velocity by the 
force which the electric field exerts on the charge. The work done by the field during a time 
$dt$, i.e. during a displacement of the charge by $d\mathbf{r}$, is clearly equal to $e\mathbf{E}\cdot d\mathbf{r}$.


† By “kinetic” we mean the energy (9.4), which includes the rest energy.
We emphasize the fact that work is done on the charge only by the electric field; the
magnetic field does no work on a charge moving in it. This is connected with the fact that 
the force which the magnetic field exerts on a charge is always perpendicular to the velocity 
of the charge. 

The equations of mechanics are invariant with respect to a change in sign of the time, that 
is, with respect to interchange of future and past. In other words, in mechanics the two time 
directions are equivalent. This means that if a certain motion is possible according to the 
equations of mechanics, then the reverse motion is also possible, in which the system passes 
through the same states in reverse order. 

It is easy to see that this is also valid for the electromagnetic field in the theory of 
relativity. In this case, however, in addition to changing $t$ into $-t$, we must reverse the sign
of the magnetic field. In fact it is easy to see that the equations of motion (17.5) are not
altered if we make the changes

$$t \to -t, \qquad \mathbf{E} \to \mathbf{E}, \qquad \mathbf{H} \to -\mathbf{H}. \tag{17.8}$$

According to (17.3) and (17.4), this does not change the scalar potential, while the vector
potential changes sign:

$$\phi \to \phi, \qquad \mathbf{A} \to -\mathbf{A}. \tag{17.9}$$

Thus, if a certain motion is possible in an electromagnetic field, then the reversed motion 
is possible in a field in which the direction of H is reversed. 


PROBLEM 

Express the acceleration of a particle in terms of its velocity and the electric and magnetic field intensities. 
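Solution: Substitute in the equation of motion (17.5) $\mathbf{p} = \mathbf{v}\mathscr{E}_{\rm kin}/c^2$, and take the expression for $d\mathscr{E}_{\rm kin}/dt$
from (17.7). As a result, we get

$$\dot{\mathbf{v}} = \frac{e}{m}\sqrt{1 - \frac{v^2}{c^2}}\left\{\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H} - \frac{1}{c^2}\,\mathbf{v}(\mathbf{v}\cdot\mathbf{E})\right\}.$$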
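
The acceleration obtained in this problem can be checked against the equation of motion (17.5). The sketch below is a minimal illustration with hypothetical names and sample field values: it evaluates the acceleration, rebuilds $d\mathbf{p}/dt$ for $\mathbf{p} = m\mathbf{v}/\sqrt{1 - v^2/c^2}$ from it, and compares the result with the Lorentz force.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def acceleration(e, m, c, v, E, H):
    """Acceleration of a charge in terms of v, E and H (result of the problem)."""
    root = math.sqrt(1 - dot(v, v) / c ** 2)
    vxH = cross(v, H)
    vE = dot(v, E)
    return [(e / m) * root * (E[k] + vxH[k] / c - v[k] * vE / c ** 2) for k in range(3)]

def momentum_rate(m, c, v, a):
    """dp/dt for p = m v / sqrt(1 - v^2/c^2), given velocity v and acceleration a."""
    gamma = 1 / math.sqrt(1 - dot(v, v) / c ** 2)
    va = dot(v, a)
    return [m * gamma * a[k] + m * gamma ** 3 * v[k] * va / c ** 2 for k in range(3)]

def lorentz_force(e, c, v, E, H):
    """Right-hand side of the equation of motion (17.5)."""
    vxH = cross(v, H)
    return [e * E[k] + (e / c) * vxH[k] for k in range(3)]

# Arbitrary sample values (hypothetical units, |v| < c).
e, m, c = 1.5, 2.0, 3.0
v = [1.0, -0.5, 2.0]
E = [0.3, 0.1, -0.7]
H = [0.0, 0.4, 1.2]

a = acceleration(e, m, c, v, E, H)
lhs = momentum_rate(m, c, v, a)
rhs = lorentz_force(e, c, v, E, H)

# In a purely magnetic field the acceleration is perpendicular to the velocity.
a_magnetic = acceleration(e, m, c, v, [0.0, 0.0, 0.0], H)
```

The last line illustrates the remark that the magnetic field does no work: with $\mathbf{E} = 0$ the product $\mathbf{v}\cdot\dot{\mathbf{v}}$ vanishes, so the speed and the energy stay constant.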


§ 18. Gauge invariance 

Let us consider to what extent the potentials are uniquely determined. First of all we call 
attention to the fact that the field is characterized by the effect which it produces on the 
motion of a charge located in it. But in the equation of motion (17.5) there appear not the 
potentials, but the field intensities E and H. Therefore two fields are physically identical if 
they are characterized by the same vectors E and H. 

If we are given potentials $\mathbf{A}$ and $\phi$, then these uniquely determine (according to (17.3) and
(17.4)) the fields $\mathbf{E}$ and $\mathbf{H}$. However, to one and the same field there can correspond different
potentials. To show this, let us add to each component of the potential the quantity $-\partial f/\partial x^k$,
where $f$ is an arbitrary function of the coordinates and the time. Then the potential $A_k$ goes
over into

$$A_k' = A_k - \frac{\partial f}{\partial x^k}. \tag{18.1}$$

As a result of this change there appears in the action integral (16.1) the additional term

$$\frac{e}{c}\frac{\partial f}{\partial x^k}\,dx^k = d\left(\frac{e}{c}f\right), \tag{18.2}$$

which is a total differential and has no effect on the equations of motion. (See Mechanics, § 2.)

If in place of the four-potential we introduce the scalar and vector potentials, and in place
of $x^i$, the coordinates $ct, x, y, z$, then the four equations (18.1) can be written in the form

$$\mathbf{A}' = \mathbf{A} + \operatorname{grad} f, \qquad \phi' = \phi - \frac{1}{c}\frac{\partial f}{\partial t}. \tag{18.3}$$

It is easy to check that electric and magnetic fields determined from equations (17.3) and
(17.4) actually do not change upon replacement of $\mathbf{A}$ and $\phi$ by $\mathbf{A}'$ and $\phi'$, defined by (18.3).
Thus the transformation of potentials (18.1) does not change the fields. The potentials are 
therefore not uniquely defined; the vector potential is determined to within the gradient of 
an arbitrary function, and the scalar potential to within the time derivative of the same 
function. 

In particular, we see that we can add an arbitrary constant vector to the vector potential, 
and an arbitrary constant to the scalar potential. This is also clear directly from the fact that 
the definitions of E and H contain only derivatives of $\mathbf{A}$ and $\phi$, and therefore the addition of
constants to the latter does not affect the field intensities. 

Only those quantities have physical meaning which are invariant with respect to the
transformation (18.3) of the potentials; in particular all equations must be invariant under
this transformation. This invariance is called gauge invariance (in German, Eichinvarianz).†

This nonuniqueness of the potentials gives us the possibility of choosing them so that they
fulfil one auxiliary condition chosen by us. We emphasize that we can set one condition,
since we may choose the function $f$ in (18.3) arbitrarily. In particular, it is always possible
to choose the potentials so that the scalar potential $\phi$ is zero. If the vector potential is not
zero, then it is not generally possible to make it zero, since the condition $\mathbf{A} = 0$ represents
three auxiliary conditions (for the three components of $\mathbf{A}$).

§ 19. Constant electromagnetic field 

By a constant electromagnetic field we mean a field which does not depend on the time. 
Clearly the potentials of a constant field can be chosen so that they are functions only of the 
coordinates and not of the time. A constant magnetic field is equal, as before, to H = curl A. 
A constant electric field is equal to 

$$\mathbf{E} = -\operatorname{grad}\phi. \tag{19.1}$$

Thus a constant electric field is determined only by the scalar potential and a constant 
magnetic field only by the vector potential. 

We saw in the preceding section that the potentials are not uniquely determined. However,
it is easy to convince oneself that if we describe the constant electromagnetic field in terms
of potentials which do not depend on the time, then we can add to the scalar potential,
without changing the fields, only an arbitrary constant (not depending on either the coordinates
or the time). Usually $\phi$ is subjected to the additional requirement that it has a definite value
at some particular point in space; most frequently $\phi$ is chosen to be zero at infinity. Thus the
arbitrary constant previously mentioned is determined, and the scalar potential of the constant
field is thus determined uniquely.

† We emphasize that this is related to the assumed constancy of e in (18.2). Thus the gauge invariance
of the equations of electrodynamics (see below) and the conservation of charge are closely related to one
another.

On the other hand, just as before, the vector potential is not uniquely determined even for 
the constant electromagnetic field; namely, we can add to it the gradient of an arbitrary 
function of the coordinates. 

We now determine the energy of a charge in a constant electromagnetic field. If the field 
is constant, then the Lagrangian for the charge also does not depend explicitly on the time. 
As we know, in this case the energy is conserved and coincides with the Hamiltonian. 
According to (16.6), we have

$$\mathcal{E} = \frac{mc^2}{\sqrt{1 - v^2/c^2}} + e\phi. \tag{19.2}$$

Thus the presence of the field adds to the energy of the particle the term $e\phi$, the potential
energy of the charge in the field. We note the important fact that the energy depends only on 
the scalar and not on the vector potential. This means that the magnetic field does not affect 
the energy of the charge. Only the electric field can change the energy of the particle. 
This is related to the fact that the magnetic field, unlike the electric field, does no work on 
the charge. 

If the field intensities are the same at all points in space, then the field is said to be 
uniform. The scalar potential of a uniform electric field can be expressed in terms of the 
field intensity as 

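$$\phi = -\mathbf{E}\cdot\mathbf{r}. \tag{19.3}$$

In fact, since $\mathbf{E} = \text{const}$, $\nabla(\mathbf{E}\cdot\mathbf{r}) = (\mathbf{E}\cdot\nabla)\mathbf{r} = \mathbf{E}$.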

The vector potential of a uniform magnetic field can be expressed in terms of its field 
intensity as 

$$\mathbf{A} = \tfrac{1}{2}\,\mathbf{H}\times\mathbf{r}. \tag{19.4}$$

In fact, recalling that H = const, we obtain with the aid of well-known formulas of vector 
analysis: 

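$$\operatorname{curl}(\mathbf{H}\times\mathbf{r}) = \mathbf{H}\operatorname{div}\mathbf{r} - (\mathbf{H}\cdot\nabla)\mathbf{r} = 2\mathbf{H}$$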

(noting that div r = 3). 

The vector potential of a uniform magnetic field can also be chosen in the form 

$$A_x = -Hy, \qquad A_y = A_z = 0 \tag{19.5}$$

(the z axis is along the direction of H). It is easily verified that with this choice for A we have 
H = curl A. In accordance with the transformation formulas (18.3), the potentials (19.4) and 
(19.5) differ from one another by the gradient of some function: formula (19.5) is obtained 
from (19.4) by adding $\nabla f$, where $f = -xyH/2$.
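
The relation between the two potentials can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a field of strength `H = 2.0` along the z axis in arbitrary units, and the helper names (`curl`, `A_symmetric`, `A_asymmetric`, `grad_f`) are illustrative choices. It confirms that both (19.4) and (19.5) have curl equal to H, and that they differ by the gradient of $f = -xyH/2$.

```python
H = 2.0  # assumed field strength along z, arbitrary units


def curl(A, r, h=1e-5):
    """Curl of the vector field A at the point r, by central differences."""
    def d(i, j):  # partial derivative dA_i / dx_j
        rp, rm = list(r), list(r)
        rp[j] += h
        rm[j] -= h
        return (A(rp)[i] - A(rm)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


def A_symmetric(r):
    """Potential (19.4): A = (1/2) H x r, with H along z."""
    x, y, z = r
    return (-0.5 * H * y, 0.5 * H * x, 0.0)


def A_asymmetric(r):
    """Potential (19.5): A_x = -H y, A_y = A_z = 0."""
    x, y, z = r
    return (-H * y, 0.0, 0.0)


def grad_f(r):
    """Gradient of the gauge function f = -x y H / 2."""
    x, y, z = r
    return (-0.5 * H * y, -0.5 * H * x, 0.0)


point = [0.3, -1.1, 0.7]
print(curl(A_symmetric, point), curl(A_asymmetric, point))
```

Both potentials are linear in the coordinates, so the central differences are exact up to rounding.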


PROBLEM 

Give the variational principle for the trajectory of a particle (Maupertuis’ principle) in a constant
electromagnetic field in relativistic mechanics.
Solution: Maupertuis’ principle consists in the statement that if the energy of a particle is conserved
(motion in a constant field), then its trajectory can be determined from the variational equation

$$\delta\int \mathbf{P}\cdot d\mathbf{r} = 0,$$

where $\mathbf{P}$ is the generalized momentum of the particle, expressed in terms of the energy and the coordinate
differentials, and the integral is taken along the trajectory of the particle.† Substituting $\mathbf{P} = \mathbf{p} + (e/c)\mathbf{A}$ and
noting that the directions of $\mathbf{p}$ and $d\mathbf{r}$ coincide, we have

$$\delta\int\left(p\,dl + \frac{e}{c}\,\mathbf{A}\cdot d\mathbf{r}\right) = 0,$$

where $dl = \sqrt{d\mathbf{r}^2}$ is the element of arc. Determining $p$ from

$$p^2 + m^2c^2 = \left(\frac{\mathcal{E} - e\phi}{c}\right)^2,$$

we obtain finally

$$\delta\int\left\{\sqrt{\frac{(\mathcal{E} - e\phi)^2}{c^2} - m^2c^2}\;dl + \frac{e}{c}\,\mathbf{A}\cdot d\mathbf{r}\right\} = 0.$$

§ 20. Motion in a constant uniform electric field 

Let us consider the motion of a charge e in a uniform constant electric field E. We take the
direction of the field as the X axis. The motion will obviously proceed in a plane, which we
choose as the XY plane. Then the equations of motion (17.5) become

$$\dot{p}_x = eE, \qquad \dot{p}_y = 0$$

(where the dot denotes differentiation with respect to $t$), so that

$$p_x = eEt, \qquad p_y = p_0. \tag{20.1}$$

The time reference point has been chosen at the moment when p x = 0; p 0 is the momentum 
of the particle at that moment. 

The kinetic energy of the particle (the energy omitting the potential energy in the field) is
$\mathcal{E}_{kin} = c\sqrt{m^2c^2 + p^2}$. Substituting (20.1), we find in our case

$$\mathcal{E}_{kin} = \sqrt{m^2c^4 + c^2p_0^2 + (ceEt)^2} = \sqrt{\mathcal{E}_0^2 + (ceEt)^2}, \tag{20.2}$$

where $\mathcal{E}_0$ is the energy at $t = 0$.

According to (9.8) the velocity of the particle is $\mathbf{v} = \mathbf{p}c^2/\mathcal{E}_{kin}$. For the velocity $v_x = \dot{x}$ we
have therefore

$$\frac{dx}{dt} = \frac{p_xc^2}{\mathcal{E}_{kin}} = \frac{c^2eEt}{\sqrt{\mathcal{E}_0^2 + (ceEt)^2}}.$$

Integrating, we find

$$x = \frac{1}{eE}\sqrt{\mathcal{E}_0^2 + (ceEt)^2}. \tag{20.3}$$

The constant of integration we set equal to zero.‡

† See Mechanics, § 44.

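For determining $y$, we have

$$\frac{dy}{dt} = \frac{p_yc^2}{\mathcal{E}_{kin}} = \frac{p_0c^2}{\sqrt{\mathcal{E}_0^2 + (ceEt)^2}},$$

from which

$$y = \frac{p_0c}{eE}\sinh^{-1}\frac{ceEt}{\mathcal{E}_0}. \tag{20.4}$$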


We obtain the equation of the trajectory by expressing $t$ in terms of $y$ from (20.4) and
substituting in (20.3). This gives:

$$x = \frac{\mathcal{E}_0}{eE}\cosh\frac{eEy}{p_0c}. \tag{20.5}$$


Thus in a uniform electric field a charge moves along a catenary curve. 

If the velocity of the particle is $v \ll c$, then we can set $p_0 = mv_0$, $\mathcal{E}_0 = mc^2$, and expand
(20.5) in series in powers of $1/c$. Then we get, to within terms of higher order,

$$x = \frac{eE}{2mv_0^2}\,y^2 + \text{const},$$

that is, the charge moves along a parabola, a result well known from classical mechanics.
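
The elimination of $t$ between (20.3) and (20.4) can be verified numerically. The sketch below is an illustration only: it assumes units with $e = E = m = c = 1$ by default, and the function names are hypothetical. It evaluates the parametric solution and checks it against the catenary (20.5).

```python
import math


def initial_energy(p0, m=1.0, c=1.0):
    """Energy at t = 0, when p_x = 0 and p_y = p0."""
    return math.sqrt((m * c**2) ** 2 + (c * p0) ** 2)


def position(t, p0, e=1.0, field=1.0, m=1.0, c=1.0):
    """Coordinates (x, y) at time t from (20.3) and (20.4)."""
    e0 = initial_energy(p0, m, c)
    x = math.sqrt(e0**2 + (c * e * field * t) ** 2) / (e * field)
    y = p0 * c / (e * field) * math.asinh(c * e * field * t / e0)
    return x, y


def catenary_x(y, p0, e=1.0, field=1.0, m=1.0, c=1.0):
    """x as a function of y along the trajectory, from (20.5)."""
    e0 = initial_energy(p0, m, c)
    return e0 / (e * field) * math.cosh(e * field * y / (p0 * c))


for t in (0.0, 0.3, 2.0, 10.0):
    x, y = position(t, p0=0.5)
    print(t, x, catenary_x(y, p0=0.5))
```

The two printed values of x agree at every time because $\cosh(\sinh^{-1}u) = \sqrt{1 + u^2}$.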


‡ This result (for $p_0 = 0$) coincides with the solution of the problem of relativistic motion with constant
“proper acceleration” $w_0 = eE/m$ (see the problem in § 7). For the present case, the constancy of the
acceleration is related to the fact that the electric field does not change for Lorentz transformations having
velocities V along the direction of the field (see § 24).


§ 21. Motion in a constant uniform magnetic field

We now consider the motion of a charge e in a uniform magnetic field H. We choose the
direction of the field as the Z axis. We rewrite the equation of motion

$$\dot{\mathbf{p}} = \frac{e}{c}\,\mathbf{v}\times\mathbf{H}$$

in another form, by substituting for the momentum, from (9.8),

$$\mathbf{p} = \frac{\mathcal{E}\mathbf{v}}{c^2},$$

where $\mathcal{E}$ is the energy of the particle, which is constant in the magnetic field. The equation
of motion then goes over into the form

$$\frac{\mathcal{E}}{c^2}\frac{d\mathbf{v}}{dt} = \frac{e}{c}\,\mathbf{v}\times\mathbf{H}, \tag{21.1}$$

or, expressed in terms of components,

$$\dot{v}_x = \omega v_y, \qquad \dot{v}_y = -\omega v_x, \qquad \dot{v}_z = 0, \tag{21.2}$$

where we have introduced the notation

$$\omega = \frac{ecH}{\mathcal{E}}. \tag{21.3}$$

We multiply the second equation of (21.2) by $i$, and add it to the first:

$$\frac{d}{dt}(v_x + iv_y) = -i\omega(v_x + iv_y),$$

so that

$$v_x + iv_y = ae^{-i\omega t},$$

where $a$ is a complex constant. This can be written in the form $a = v_{0t}e^{-i\alpha}$, where $v_{0t}$ and $\alpha$
are real. Then

$$v_x + iv_y = v_{0t}e^{-i(\omega t + \alpha)}$$

and, separating real and imaginary parts, we find

$$v_x = v_{0t}\cos(\omega t + \alpha), \qquad v_y = -v_{0t}\sin(\omega t + \alpha). \tag{21.4}$$

The constants $v_{0t}$ and $\alpha$ are determined by the initial conditions; $\alpha$ is the initial phase, and
as for $v_{0t}$, from (21.4) it is clear that

$$v_{0t} = \sqrt{v_x^2 + v_y^2},$$

that is, $v_{0t}$ is the velocity of the particle in the XY plane, and stays constant throughout the
motion.

From (21.4) we find, integrating once more,

$$x = x_0 + r\sin(\omega t + \alpha), \qquad y = y_0 + r\cos(\omega t + \alpha), \tag{21.5}$$

where

$$r = \frac{v_{0t}}{\omega} = \frac{v_{0t}\mathcal{E}}{ecH} = \frac{cp_t}{eH} \tag{21.6}$$

($p_t$ is the projection of the momentum on the XY plane). From the third equation of (21.2),
we find $v_z = v_{0z}$ and

$$z = z_0 + v_{0z}t. \tag{21.7}$$


From (21.5) and (21.7), it is clear that the charge moves in a uniform magnetic field along
a helix having its axis along the direction of the magnetic field and with a radius $r$ given by
(21.6). The velocity of the particle is constant. In the special case where $v_{0z} = 0$, that is, the
charge has no velocity component along the field, it moves along a circle in the plane
perpendicular to the field.
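
As a rough numerical illustration of (21.4)-(21.7), the sketch below assumes units with $e = H = m = c = 1$ by default; the function names are hypothetical. It computes the energy, the frequency and the radius of the helix, checks that $r = cp_t/eH$, and evaluates position and velocity along the trajectory.

```python
import math


def helix_parameters(v0t, v0z, e=1.0, H=1.0, m=1.0, c=1.0):
    """Return (energy, omega, r, p_t) for transverse speed v0t and axial speed v0z."""
    v2 = v0t**2 + v0z**2
    energy = m * c**2 / math.sqrt(1.0 - v2 / c**2)
    omega = e * c * H / energy       # omega = e c H / energy
    r = v0t / omega                  # radius of the helix, (21.6)
    p_t = energy * v0t / c**2        # transverse momentum, p = energy * v / c^2
    return energy, omega, r, p_t


def position(t, v0t, v0z, alpha=0.0, e=1.0, H=1.0, m=1.0, c=1.0):
    """Coordinates from (21.5) and (21.7), taking x0 = y0 = z0 = 0."""
    _, omega, r, _ = helix_parameters(v0t, v0z, e, H, m, c)
    phase = omega * t + alpha
    return (r * math.sin(phase), r * math.cos(phase), v0z * t)


def velocity(t, v0t, v0z, alpha=0.0, e=1.0, H=1.0, m=1.0, c=1.0):
    """Velocity components from (21.4), with v_z = v0z."""
    _, omega, _, _ = helix_parameters(v0t, v0z, e, H, m, c)
    phase = omega * t + alpha
    return (v0t * math.cos(phase), -v0t * math.sin(phase), v0z)


energy, omega, r, p_t = helix_parameters(0.6, 0.3)
print(r, p_t)  # with c = e = H = 1 these coincide, as (21.6) requires
```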

The quantity $\omega$, as we see from the formulas, is the angular frequency of rotation of the
particle in the plane perpendicular to the field.

If the velocity of the particle is low, then we can approximately set $\mathcal{E} = mc^2$. Then the
frequency $\omega$ is changed to

$$\omega = \frac{eH}{mc}. \tag{21.8}$$

We shall now assume that the magnetic field remains uniform but varies slowly in magnitude
and direction. Let us see how the motion of a charged particle changes in this case.

We know that when the conditions of the motion are changed slowly, certain quantities
called adiabatic invariants remain constant. Since the motion in the plane perpendicular to
the magnetic field is periodic, the adiabatic invariant is the integral

$$I = \frac{1}{2\pi}\oint \mathbf{P}_t\cdot d\mathbf{r},$$

taken over a complete period of the motion, i.e. over the circumference of a circle in the
present case ($\mathbf{P}_t$ is the projection of the generalized momentum on the plane perpendicular
to H†). Substituting $\mathbf{P}_t = \mathbf{p}_t + (e/c)\mathbf{A}$, we have:

$$I = \frac{1}{2\pi}\oint \mathbf{P}_t\cdot d\mathbf{r} = \frac{1}{2\pi}\oint \mathbf{p}_t\cdot d\mathbf{r} + \frac{e}{2\pi c}\oint \mathbf{A}\cdot d\mathbf{r}.$$

In the first term we note that $\mathbf{p}_t$ is constant in magnitude and directed along $d\mathbf{r}$; we apply
Stokes’ theorem to the second term and write curl A = H:*

$$I = rp_t - \frac{e}{2c}Hr^2,$$

where $r$ is the radius of the orbit. Substituting the expression (21.6) for $r$, we find:

$$I = \frac{cp_t^2}{2eH}. \tag{21.9}$$

From this we see that, for slow variation of H, the tangential momentum $p_t$ varies proportionally
to $\sqrt{H}$.

This result can also be applied to another case, when the particle moves along a helical
path in a magnetic field that is not strictly homogeneous (so that the field varies little over
distances comparable with the radius and step of the helix). Such a motion can be considered
as a motion in a circular orbit that shifts in the course of time, while relative to the orbit the
field appears to change in time but remain uniform. One can then state that the component
of the momentum transverse to the direction of the field varies according to the law: $p_t^2 = CH$,
where C is a constant and H is a given function of the coordinates. On the other hand, just
as for the motion in any constant magnetic field, the energy of the particle (and consequently
the square of its momentum $p^2$) remains constant. Therefore the longitudinal component of
the momentum varies according to the formula:

$$p_l^2 = p^2 - p_t^2 = p^2 - CH(x, y, z). \tag{21.10}$$

Since we should always have $p_l^2 \ge 0$, we see that penetration of the particle into regions
of sufficiently high field ($CH > p^2$) is impossible. During motion in the direction of increasing
field, the radius of the helical trajectory decreases proportionally to $p_t/H$ (i.e. proportionally
to $1/\sqrt{H}$), and the step proportionally to $p_l$. On reaching the boundary where $p_l$ vanishes,
the particle is reflected; while continuing to rotate in the same direction it begins to move
opposite to the gradient of the field.

† See Mechanics, § 49. In general the integrals $\oint p\,dq$, taken over a period of the particular coordinate
$q$, are adiabatic invariants. In the present case the periods for the two coordinates in the plane perpendicular
to H coincide, and the integral $I$ which we have written is the sum of the two corresponding adiabatic
invariants. However, each of these invariants individually has no special significance, since it depends on the
(non-unique) choice of the vector potential of the field. The nonuniqueness of the adiabatic invariants
which results from this is a reflection of the fact that, when we regard the magnetic field as uniform over
all of space, we cannot in principle determine the electric field which results from changes in H, since it will
actually depend on the specific conditions at infinity.

* By inspecting the direction of motion of a charge along the orbit for a given direction of H, we observe
that it is counterclockwise if we look along H. Hence the negative sign in the second term.
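
The reflection condition that follows from (21.10) can be made concrete with a short sketch. The function names and sample numbers below are illustrative assumptions, not taken from the text: the constant C is fixed by the transverse momentum `p_t0` at a reference point where the field is `H0`, and the particle is turned back where $CH = p^2$.

```python
def longitudinal_momentum_sq(p, p_t0, H0, H):
    """p_l^2 at field strength H, from (21.10), with C fixed by p_t0^2 = C * H0."""
    C = p_t0**2 / H0
    return p**2 - C * H


def reflection_field(p, p_t0, H0):
    """Field strength at which p_l vanishes and the particle is reflected."""
    return H0 * p**2 / p_t0**2


H_max = reflection_field(p=1.0, p_t0=0.5, H0=2.0)
print(H_max, longitudinal_momentum_sq(1.0, 0.5, 2.0, H_max))
```

For these sample values the reflection field is four times the reference field, since the transverse momentum starts at half the total momentum.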

Inhomogeneity of the field also leads to another phenomenon—a slow transverse shift 
(drift) of the guiding centre of the helical trajectory of the particle (the name given to the 
centre of the circular orbit); problem 3 of the next section deals with this question. 


PROBLEM 

Determine the frequency of vibration of a charged spatial oscillator, placed in a constant, uniform
magnetic field; the proper frequency of vibration of the oscillator (in the absence of the field) is $\omega_0$.
Solution: The equations of forced vibration of the oscillator in a magnetic field (directed along the z axis)
are:

$$\ddot{x} + \omega_0^2x = \frac{eH}{mc}\,\dot{y}, \qquad \ddot{y} + \omega_0^2y = -\frac{eH}{mc}\,\dot{x}, \qquad \ddot{z} + \omega_0^2z = 0.$$

Multiplying the second equation by $i$ and combining with the first, we find

$$\ddot{\xi} + \omega_0^2\xi = -i\,\frac{eH}{mc}\,\dot{\xi},$$

where $\xi = x + iy$. From this we find that the frequency of vibration of the oscillator in a plane perpendicular
to the field is

$$\omega = \sqrt{\omega_0^2 + \frac{1}{4}\left(\frac{eH}{mc}\right)^2} \pm \frac{eH}{2mc}.$$

If the field H is weak, this formula goes over into

$$\omega = \omega_0 \pm eH/2mc.$$

The vibration along the direction of the field remains unchanged. 
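
A quick numerical check of the two transverse frequencies can be sketched as follows; the function name and the sample values are assumptions made for illustration. The two roots have product $\omega_0^2$ and differ by $eH/mc$, and for a weak field they approach $\omega_0 \pm eH/2mc$.

```python
import math


def oscillator_frequencies(omega0, e, H, m, c):
    """Frequencies of vibration in the plane perpendicular to the field."""
    half = e * H / (2.0 * m * c)
    root = math.sqrt(omega0**2 + half**2)
    return root + half, root - half


w_plus, w_minus = oscillator_frequencies(omega0=2.0, e=1.0, H=0.3, m=1.0, c=1.0)
print(w_plus, w_minus, w_plus * w_minus)
```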


§ 22. Motion of a charge in constant uniform electric and magnetic fields

Finally we consider the motion of a charge in the case where there are present both electric
and magnetic fields, constant and uniform. We limit ourselves to the case where the velocity
of the charge $v \ll c$, so that its momentum $\mathbf{p} = m\mathbf{v}$; as we shall see later, it is necessary for
this that the electric field be small compared to the magnetic.

We choose the direction of H as the Z axis, and the plane passing through H and E as the
YZ plane. Then the equation of motion

$$m\dot{\mathbf{v}} = e\mathbf{E} + \frac{e}{c}\,\mathbf{v}\times\mathbf{H}$$

can be written in the form

$$m\ddot{x} = \frac{e}{c}\,\dot{y}H, \qquad m\ddot{y} = eE_y - \frac{e}{c}\,\dot{x}H, \qquad m\ddot{z} = eE_z. \tag{22.1}$$

From the third equation we see that the charge moves with uniform acceleration in the Z
direction, that is,

$$z = \frac{eE_z}{2m}\,t^2 + v_{0z}t. \tag{22.2}$$

Multiplying the second equation of (22.1) by $i$ and combining with the first, we find

$$\frac{d}{dt}(\dot{x} + i\dot{y}) + i\omega(\dot{x} + i\dot{y}) = i\,\frac{e}{m}\,E_y$$

($\omega = eH/mc$). The integral of this equation, where $\dot{x} + i\dot{y}$ is considered as the unknown, is
equal to the sum of the integral of the same equation without the right-hand term and a
particular integral of the equation with the right-hand term. The first of these is $ae^{-i\omega t}$, the
second is $eE_y/m\omega = cE_y/H$. Thus

$$\dot{x} + i\dot{y} = ae^{-i\omega t} + \frac{cE_y}{H}.$$

The constant $a$ is in general complex. Writing it in the form $a = be^{i\alpha}$, with real $b$ and $\alpha$, we
see that since $a$ is multiplied by $e^{-i\omega t}$, we can, by a suitable choice of the time origin, give
the phase $\alpha$ any arbitrary value. We choose this so that $a$ is real. Then breaking up $\dot{x} + i\dot{y}$
into real and imaginary parts, we find

$$\dot{x} = a\cos\omega t + \frac{cE_y}{H}, \qquad \dot{y} = -a\sin\omega t. \tag{22.3}$$

At $t = 0$ the velocity is along the X axis.

We see that the components of the velocity of the particle are periodic functions of the
time. Their average values are:

$$\bar{\dot{x}} = \frac{cE_y}{H}, \qquad \bar{\dot{y}} = 0.$$

This average velocity of motion of a charge in crossed electric and magnetic fields is often
called the electrical drift velocity. Its direction is perpendicular to both fields and independent
of the sign of the charge. It can be written in vector form as:

$$\bar{\mathbf{v}} = \frac{c\,\mathbf{E}\times\mathbf{H}}{H^2}. \tag{22.4}$$
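
The drift formula can be illustrated with a small sketch. The function names, the choice of units with $c = 1$ by default, and the sample field values are assumptions. The first function evaluates the vector form (22.4); the second averages the velocity components (22.3) over one period by direct sampling.

```python
import math


def drift_velocity(E, H, c=1.0):
    """Drift velocity c (E x H) / H^2 for field vectors E and H."""
    Ex, Ey, Ez = E
    Hx, Hy, Hz = H
    H2 = Hx * Hx + Hy * Hy + Hz * Hz
    cross = (Ey * Hz - Ez * Hy, Ez * Hx - Ex * Hz, Ex * Hy - Ey * Hx)
    return tuple(c * k / H2 for k in cross)


def mean_velocity_xy(a, Ey, H, omega, c=1.0, n=1000):
    """Average of the components (22.3) over one period, from n equally spaced samples."""
    period = 2.0 * math.pi / omega
    times = [k * period / n for k in range(n)]
    vx = sum(a * math.cos(omega * t) + c * Ey / H for t in times) / n
    vy = sum(-a * math.sin(omega * t) for t in times) / n
    return vx, vy


print(drift_velocity((0.0, 0.02, 0.01), (0.0, 0.0, 2.0)))
print(mean_velocity_xy(a=0.05, Ey=0.02, H=2.0, omega=2.0))
```

Both results give a drift of $cE_y/H$ along the X axis, with no contribution from $E_z$ or from the amplitude $a$.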


All the formulas of this section assume that the velocity of the particle is small compared
with the velocity of light; we see that for this to be so, it is necessary in particular that the
electric and magnetic fields satisfy the condition

$$\frac{E_y}{H} \ll 1, \tag{22.5}$$

while the absolute magnitudes of $E_y$ and $H$ can be arbitrary.

Integrating equation (22.3) again, and choosing the constant of integration so that at $t = 0$,
$x = y = 0$, we obtain

$$x = \frac{a}{\omega}\sin\omega t + \frac{cE_y}{H}\,t, \qquad y = \frac{a}{\omega}(\cos\omega t - 1). \tag{22.6}$$

Considered as parametric equations of a curve, these equations define a trochoid. Depending 
on whether a is larger or smaller in absolute value than the quantity cE y /H, the projection 
of the trajectory on the plane XY has the forms shown in Figs. 6a and 6b, respectively. 

If $a = -cE_y/H$, then (22.6) becomes

$$x = \frac{cE_y}{\omega H}(\omega t - \sin\omega t), \qquad y = \frac{cE_y}{\omega H}(1 - \cos\omega t), \tag{22.7}$$

that is, the projection of the trajectory on the XY plane is a cycloid (Fig. 6c).

[Fig. 6 (panels a, b, c): projections of the trajectory on the XY plane.]


PROBLEMS 


1. Determine the relativistic motion of a charge in parallel uniform electric and magnetic fields.
Solution: The magnetic field has no influence on the motion along the common direction of E and H (the
z axis), which therefore occurs under the influence of the electric field alone; therefore according to § 20
we find:

$$z = \frac{\mathcal{E}_{kin}}{eE}, \qquad \mathcal{E}_{kin} = \sqrt{\mathcal{E}_0^2 + (ceEt)^2}.$$

For the motion in the xy plane we have the equation

$$\dot{p}_x = \frac{e}{c}\,Hv_y, \qquad \dot{p}_y = -\frac{e}{c}\,Hv_x,$$

or

$$\frac{d}{dt}(p_x + ip_y) = -i\,\frac{eH}{c}(v_x + iv_y) = -\frac{ieHc}{\mathcal{E}_{kin}}(p_x + ip_y).$$

Consequently

$$p_x + ip_y = p_te^{-i\varphi},$$

where $p_t$ is the constant value of the projection of the momentum on the xy plane, and the auxiliary quantity
$\varphi$ is defined by the relation

$$d\varphi = eHc\,\frac{dt}{\mathcal{E}_{kin}},$$

from which

$$ct = \frac{\mathcal{E}_0}{eE}\sinh\frac{E\varphi}{H}. \tag{1}$$

Furthermore we have:

$$p_x + ip_y = p_te^{-i\varphi} = \frac{\mathcal{E}_{kin}}{c^2}(\dot{x} + i\dot{y}) = \frac{eH}{c}\frac{d(x + iy)}{d\varphi},$$

so that

$$x = \frac{cp_t}{eH}\sin\varphi, \qquad y = \frac{cp_t}{eH}\cos\varphi. \tag{2}$$

Formulas (1), (2) together with the formula

$$z = \frac{\mathcal{E}_0}{eE}\cosh\frac{E\varphi}{H} \tag{3}$$

determine the motion of the particle in parametric form. The trajectory is a helix with radius $cp_t/eH$ and
monotonically increasing step, along which the particle moves with decreasing angular velocity
$\dot{\varphi} = eHc/\mathcal{E}_{kin}$ and with a velocity along the z axis which tends toward the value $c$.

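2. Determine the relativistic motion of a charge in electric and magnetic fields which are mutually
perpendicular and equal in magnitude.†

Solution: Choosing the z axis along H and the y axis along E and setting E = H, we write the equations
of motion:

$$\frac{dp_x}{dt} = \frac{e}{c}\,Ev_y, \qquad \frac{dp_y}{dt} = eE\left(1 - \frac{v_x}{c}\right), \qquad \frac{dp_z}{dt} = 0,$$

and, as a consequence of them, formula (17.7),

$$\frac{d\mathcal{E}_{kin}}{dt} = eEv_y.$$

From these equations we have:

$$p_z = \text{const}, \qquad \mathcal{E}_{kin} - cp_x = \text{const} \equiv \alpha.$$

Also using the equation

$$\mathcal{E}_{kin}^2 - c^2p_x^2 = (\mathcal{E}_{kin} + cp_x)(\mathcal{E}_{kin} - cp_x) = c^2p_y^2 + \varepsilon^2$$

(where $\varepsilon^2 = m^2c^4 + c^2p_z^2 = \text{const}$), we find:

$$\mathcal{E}_{kin} + cp_x = \frac{1}{\alpha}(c^2p_y^2 + \varepsilon^2),$$

and so

$$\mathcal{E}_{kin} = \frac{\alpha}{2} + \frac{c^2p_y^2 + \varepsilon^2}{2\alpha}, \qquad p_x = -\frac{\alpha}{2c} + \frac{c^2p_y^2 + \varepsilon^2}{2\alpha c}.$$

Furthermore, we write

$$\mathcal{E}_{kin}\frac{dp_y}{dt} = eE\left(\mathcal{E}_{kin} - \frac{\mathcal{E}_{kin}v_x}{c}\right) = eE(\mathcal{E}_{kin} - cp_x) = eE\alpha,$$

from which

$$2eEt = \left(1 + \frac{\varepsilon^2}{\alpha^2}\right)p_y + \frac{c^2}{3\alpha^2}\,p_y^3. \tag{1}$$

To determine the trajectory, we make a transformation of variables in the equations

$$\frac{dx}{dt} = \frac{c^2p_x}{\mathcal{E}_{kin}}, \qquad \frac{dy}{dt} = \frac{c^2p_y}{\mathcal{E}_{kin}}, \qquad \frac{dz}{dt} = \frac{c^2p_z}{\mathcal{E}_{kin}}$$

to the variable $p_y$ by using the relation $dt = \mathcal{E}_{kin}\,dp_y/eE\alpha$, after which integration gives the formulas:

$$x = \frac{c}{2eE}\left(-1 + \frac{\varepsilon^2}{\alpha^2}\right)p_y + \frac{c^3}{6\alpha^2eE}\,p_y^3, \qquad y = \frac{c^2}{2\alpha eE}\,p_y^2, \qquad z = \frac{p_zc^2}{eE\alpha}\,p_y. \tag{2}$$

Formulas (1) and (2) completely determine the motion of the particle in parametric form (parameter $p_y$). We
call attention to the fact that the velocity increases most rapidly in the direction perpendicular to E and H
(the x axis).

† The problem of motion in mutually perpendicular fields E and H which are not equal in magnitude can,
by a suitable transformation of the reference system, be reduced to the problem of motion in a pure electric
or a pure magnetic field (see § 25).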

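3. Determine the velocity of drift of the guiding centre of the orbit of a nonrelativistic charged particle
in a quasihomogeneous magnetic field (H. Alfvén, 1940).

Solution: We assume first that the particle is moving in a circular orbit, i.e. its velocity has no longitudinal
component (along the field). We write the equation of the trajectory in the form $\mathbf{r} = \mathbf{R}(t) + \boldsymbol{\zeta}(t)$, where $\mathbf{R}(t)$
is the radius vector of the guiding centre (a slowly varying function of the time), while $\boldsymbol{\zeta}(t)$ is a rapidly
oscillating quantity describing the rotational motion about the guiding centre. We average the force
$(e/c)\,\dot{\mathbf{r}}\times\mathbf{H}(\mathbf{r})$ acting on the particle over a period of the oscillatory (circular) motion (compare Mechanics,
§ 30). We expand the function $\mathbf{H}(\mathbf{r})$ in this expression in powers of $\boldsymbol{\zeta}$:

$$\mathbf{H}(\mathbf{r}) = \mathbf{H}(\mathbf{R}) + (\boldsymbol{\zeta}\cdot\nabla)\mathbf{H}(\mathbf{R}).$$

On averaging, the terms of first order in $\boldsymbol{\zeta}(t)$ vanish, while the second-degree terms give rise to an additional
force

$$\mathbf{f} = \frac{e}{c}\,\overline{\dot{\boldsymbol{\zeta}}\times(\boldsymbol{\zeta}\cdot\nabla)\mathbf{H}}.$$

For a circular orbit

$$\dot{\boldsymbol{\zeta}} = \omega\,\boldsymbol{\zeta}\times\mathbf{n}, \qquad \zeta = \frac{v_\perp}{\omega},$$

where $\mathbf{n}$ is a unit vector along H; the frequency $\omega = eH/mc$; $v_\perp$ is the velocity of the particle in its circular
motion. The average values of products of components of the vector $\boldsymbol{\zeta}$, rotating in a plane (the plane
perpendicular to $\mathbf{n}$), are:

$$\overline{\zeta_\alpha\zeta_\beta} = \tfrac{1}{2}\,\zeta^2\delta_{\alpha\beta},$$

where $\delta_{\alpha\beta}$ is the unit tensor in this plane. As a result we find:

$$\mathbf{f} = -\frac{mv_\perp^2}{2H}\,(\mathbf{n}\times\nabla)\times\mathbf{H}.$$

Because of the equations div H = 0 and curl H = 0 which the constant field $\mathbf{H}(\mathbf{R})$ satisfies, we have:

$$(\mathbf{n}\times\nabla)\times\mathbf{H} = -\mathbf{n}\operatorname{div}\mathbf{H} + (\mathbf{n}\cdot\nabla)\mathbf{H} + \mathbf{n}\times(\nabla\times\mathbf{H}) = (\mathbf{n}\cdot\nabla)\mathbf{H} = H(\mathbf{n}\cdot\nabla)\mathbf{n} + \mathbf{n}(\mathbf{n}\cdot\nabla H).$$

We are interested in the force transverse to $\mathbf{n}$, giving rise to a shift of the orbit; it is equal to

$$\mathbf{f} = -\frac{mv_\perp^2}{2}\,(\mathbf{n}\cdot\nabla)\mathbf{n} = \frac{mv_\perp^2}{2\rho}\,\boldsymbol{\nu},$$

where $\rho$ is the radius of curvature of the force line of the field at the given point, and $\boldsymbol{\nu}$ is a unit vector
directed from the centre of curvature to this point.

The case where the particle also has a longitudinal velocity $v_\parallel$ (along $\mathbf{n}$) reduces to the previous case if
we go over to a reference frame which is rotating about the instantaneous centre of curvature of the force
line (which is the trajectory of the guiding centre) with angular velocity $v_\parallel/\rho$. In this reference system the
particle has no longitudinal velocity, but there is an additional transverse force, the centrifugal force
$mv_\parallel^2/\rho$. Thus the total transverse force is

$$\mathbf{f}_\perp = \boldsymbol{\nu}\,\frac{m}{\rho}\left(v_\parallel^2 + \frac{v_\perp^2}{2}\right).$$

This force is equivalent to a constant electric field of strength $\mathbf{f}_\perp/e$. According to (22.4) it causes a drift
of the guiding centre of the orbit with a velocity

$$\mathbf{v}_d = \frac{1}{\omega\rho}\left(v_\parallel^2 + \frac{v_\perp^2}{2}\right)\boldsymbol{\nu}\times\mathbf{n}.$$

The sign of this velocity depends on the sign of the charge.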


§ 23. The electromagnetic field tensor 

In § 17, we derived the equation of motion of a charge in a field, starting from the 
Lagrangian (16.4) written in three-dimensional form. We now derive the same equation 
directly from the action (16.1) written in four-dimensional notation. 

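The principle of least action states

$$\delta S = \delta\int_a^b\left(-mc\,ds - \frac{e}{c}\,A_i\,dx^i\right) = 0. \tag{23.1}$$

Noting that $ds = \sqrt{dx_i\,dx^i}$, we find (the limits of integration $a$ and $b$ are omitted for
brevity):

$$\delta S = -\int\left(mc\,\frac{dx_i\,d\delta x^i}{ds} + \frac{e}{c}\,A_i\,d\delta x^i + \frac{e}{c}\,\delta A_i\,dx^i\right) = 0.$$

We integrate the first two terms in the integrand by parts. Also, in the first term we set
$dx_i/ds = u_i$, where $u_i$ are the components of the four-velocity. Then

$$\int\left(mc\,du_i\,\delta x^i + \frac{e}{c}\,\delta x^i\,dA_i - \frac{e}{c}\,\delta A_i\,dx^i\right) - \left[\left(mcu_i + \frac{e}{c}\,A_i\right)\delta x^i\right] = 0. \tag{23.2}$$

The second term in this equation is zero, since the integral is varied with fixed coordinate
values at the limits. Furthermore:

$$\delta A_i = \frac{\partial A_i}{\partial x^k}\,\delta x^k, \qquad dA_i = \frac{\partial A_i}{\partial x^k}\,dx^k,$$

and therefore

$$\int\left(mc\,du_i\,\delta x^i + \frac{e}{c}\frac{\partial A_i}{\partial x^k}\,\delta x^i\,dx^k - \frac{e}{c}\frac{\partial A_i}{\partial x^k}\,dx^i\,\delta x^k\right) = 0.$$

In the first term we write $du_i = (du_i/ds)\,ds$, in the second and third, $dx^i = u^i\,ds$. In addition,
in the third term we interchange the indices $i$ and $k$ (this changes nothing since the indices
$i$ and $k$ are summed over). Then

$$\int\left[mc\,\frac{du_i}{ds} - \frac{e}{c}\left(\frac{\partial A_k}{\partial x^i} - \frac{\partial A_i}{\partial x^k}\right)u^k\right]\delta x^i\,ds = 0.$$

In view of the arbitrariness of $\delta x^i$, it follows that the integrand is zero, that is,

$$mc\,\frac{du_i}{ds} = \frac{e}{c}\left(\frac{\partial A_k}{\partial x^i} - \frac{\partial A_i}{\partial x^k}\right)u^k.$$

We now introduce the notation

$$F_{ik} = \frac{\partial A_k}{\partial x^i} - \frac{\partial A_i}{\partial x^k}. \tag{23.3}$$

The antisymmetric tensor $F_{ik}$ is called the electromagnetic field tensor. The equation of
motion then takes the form:

$$mc\,\frac{du^i}{ds} = \frac{e}{c}\,F^{ik}u_k. \tag{23.4}$$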


These are the equations of motion of a charge in four-dimensional form. 

The meaning of the individual components of the tensor $F_{ik}$ is easily seen by substituting
the values $A_i = (\phi, -\mathbf{A})$ in the definition (23.3). The result can be written as a matrix in which
the index $i = 0, 1, 2, 3$ labels the rows, and the index $k$ the columns:

$$F_{ik} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & -H_z & H_y \\ -E_y & H_z & 0 & -H_x \\ -E_z & -H_y & H_x & 0 \end{pmatrix}, \qquad F^{ik} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & -H_z & H_y \\ E_y & H_z & 0 & -H_x \\ E_z & -H_y & H_x & 0 \end{pmatrix}. \tag{23.5}$$

More briefly, we can write (see § 6):

$$F_{ik} = (\mathbf{E}, \mathbf{H}), \qquad F^{ik} = (-\mathbf{E}, \mathbf{H}).$$

Thus the components of the electric and magnetic field strengths are components of the 
same electromagnetic field four-tensor. 

Changing to three-dimensional notation, it is easy to verify that the three space components
($i = 1, 2, 3$) of (23.4) are identical with the vector equation of motion (17.5), while the time
component ($i = 0$) gives the work equation (17.7). The latter is a consequence of the
equations of motion; the fact that only three of the four equations are independent can also
easily be found directly by multiplying both sides of (23.4) by $u_i$. Then the left side of the
equation vanishes because of the orthogonality of the four-vectors $u_i$ and $du^i/ds$, while the
right side vanishes because of the antisymmetry of $F^{ik}$.

If we admit only possible trajectories when we vary S, the first term in (23.2) vanishes
identically. Then the second term, in which the upper limit is considered as variable, gives
the differential of the action as a function of the coordinates. Thus

$$\delta S = -\left(mcu_i + \frac{e}{c}\,A_i\right)\delta x^i. \tag{23.6}$$

Then

$$-\frac{\partial S}{\partial x^i} = mcu_i + \frac{e}{c}\,A_i = p_i + \frac{e}{c}\,A_i. \tag{23.7}$$

The four-vector $-\partial S/\partial x^i$ is the four-vector $P_i$ of the generalized momentum of the particle.
Substituting the values of the components $p_i$ and $A_i$, we find that

$$P^i = \left(\frac{\mathcal{E}_{kin} + e\phi}{c},\; \mathbf{p} + \frac{e}{c}\,\mathbf{A}\right). \tag{23.8}$$

As expected, the space components of the four-vector form the three-dimensional generalized
momentum vector (16.5), while the time component is $\mathcal{E}/c$, where $\mathcal{E}$ is the total energy of the
charge in the field.
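
The structure of the field tensor can be explored with a short sketch. It assumes the metric signature (+, -, -, -) used throughout the book; the function names and sample numbers are illustrative. The code builds the covariant matrix from E and H, raises both indices to obtain the contravariant one, and checks that the contraction $u_iF^{ik}u_k$ vanishes, which is why only three of the four equations (23.4) are independent.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of the metric tensor, signature (+,-,-,-)


def field_tensor_lower(E, H):
    """Covariant field tensor F_ik; rows are labelled by i, columns by k."""
    Ex, Ey, Ez = E
    Hx, Hy, Hz = H
    return [
        [0.0, Ex, Ey, Ez],
        [-Ex, 0.0, -Hz, Hy],
        [-Ey, Hz, 0.0, -Hx],
        [-Ez, -Hy, Hx, 0.0],
    ]


def raise_indices(F):
    """Contravariant components F^ik, obtained by raising both indices."""
    return [[METRIC[i] * METRIC[k] * F[i][k] for k in range(4)] for i in range(4)]


def contract(F_upper, u_lower):
    """The scalar u_i F^ik u_k, which vanishes for any antisymmetric F^ik."""
    return sum(u_lower[i] * F_upper[i][k] * u_lower[k]
               for i in range(4) for k in range(4))


F_lower = field_tensor_lower((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))
F_upper = raise_indices(F_lower)
u_lower = (1.25, -0.75, 0.0, 0.0)  # covariant four-velocity for v = 0.6 c along x
print(F_upper[0], contract(F_upper, u_lower))
```

Raising the indices changes the sign of the electric components only, in agreement with the brief notation $F_{ik} = (\mathbf{E}, \mathbf{H})$, $F^{ik} = (-\mathbf{E}, \mathbf{H})$.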


§ 24. Lorentz transformation of the field 


In this section we find the transformation formulas for fields, that is, formulas by means 
of which we can determine the field in one inertial system of reference, knowing the same 
field in another system. 

The formulas for transformation of the potentials are obtained directly from the general
formulas for transformation of four-vectors (6.1). Remembering that $A^i = (\phi, \mathbf{A})$, we get
easily

$$\phi = \frac{\phi' + \frac{V}{c}A'_x}{\sqrt{1 - V^2/c^2}}, \qquad A_x = \frac{A'_x + \frac{V}{c}\phi'}{\sqrt{1 - V^2/c^2}}, \qquad A_y = A'_y, \qquad A_z = A'_z. \tag{24.1}$$

The transformation formulas for an antisymmetric second-rank tensor (like $F^{ik}$) were
found in problem 2 of § 6: the components $F^{23}$ and $F^{01}$ do not change, while the components
$F^{02}$, $F^{03}$, and $F^{12}$, $F^{13}$ transform like $x^0$ and $x^1$, respectively. Expressing the components of
$F^{ik}$ in terms of the components of the fields E and H, according to (23.5), we then find the
following formulas of transformation for the electric field:

$$E_x = E'_x, \qquad E_y = \frac{E'_y + \frac{V}{c}H'_z}{\sqrt{1 - V^2/c^2}}, \qquad E_z = \frac{E'_z - \frac{V}{c}H'_y}{\sqrt{1 - V^2/c^2}}, \tag{24.2}$$

and for the magnetic field:

$$H_x = H'_x, \qquad H_y = \frac{H'_y - \frac{V}{c}E'_z}{\sqrt{1 - V^2/c^2}}, \qquad H_z = \frac{H'_z + \frac{V}{c}E'_y}{\sqrt{1 - V^2/c^2}}. \tag{24.3}$$

Thus the electric and magnetic fields, like the majority of physical quantities, are relative;
that is, their properties are different in different reference systems. In particular, the electric
or the magnetic field can be equal to zero in one reference system and at the same time be
present in another system.

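The formulas (24.2), (24.3) simplify considerably for the case $V \ll c$. To terms of order
$V/c$, we have:

$$E_x = E'_x, \qquad E_y = E'_y + \frac{V}{c}H'_z, \qquad E_z = E'_z - \frac{V}{c}H'_y;$$

$$H_x = H'_x, \qquad H_y = H'_y - \frac{V}{c}E'_z, \qquad H_z = H'_z + \frac{V}{c}E'_y.$$

These formulas can be written in vector form

$$\mathbf{E} = \mathbf{E}' + \frac{1}{c}\,\mathbf{H}'\times\mathbf{V}, \qquad \mathbf{H} = \mathbf{H}' - \frac{1}{c}\,\mathbf{E}'\times\mathbf{V}. \tag{24.4}$$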

c c 

The formulas for the inverse transformation from K' to K are obtained directly from 
(24.2)-(24.4) by changing the sign of V and shifting the prime. 

If the magnetic field $\mathbf{H}' = 0$ in the K' system, then, as we easily verify on the basis of (24.2)
and (24.3), the following relation exists between the electric and magnetic fields in the K
system:

$$\mathbf{H} = \frac{1}{c}\,\mathbf{V}\times\mathbf{E}. \tag{24.5}$$

If in the K' system, $\mathbf{E}' = 0$, then in the K system

$$\mathbf{E} = -\frac{1}{c}\,\mathbf{V}\times\mathbf{H}. \tag{24.6}$$

Consequently, in both cases, in the K system the magnetic and electric fields are mutually
perpendicular.

These formulas also have a significance when used in the reverse direction: if the fields
E and H are mutually perpendicular (but not equal in magnitude) in some reference system
K, then there exists a reference system K' in which the field is pure electric or pure magnetic.
The velocity V of this system (relative to K) is perpendicular to E and H and equal in
magnitude to $cH/E$ in the first case (where we must have $H < E$) and to $cE/H$ in the second
case (where $E < H$).
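
The transformation formulas lend themselves to a direct numerical check. The sketch below is illustrative: it assumes units with $c = 1$ by default, and the function name is hypothetical. It implements (24.2) and (24.3) for relative motion along the x axis, confirms relation (24.5) when $\mathbf{H}' = 0$, and then uses the inverse transformation (sign of V reversed) to find the frame in which a field with $E < H$ and $\mathbf{E} \perp \mathbf{H}$ becomes purely magnetic.

```python
import math


def transform_fields(E_prime, H_prime, V, c=1.0):
    """Fields in K from the fields in K', which moves along x with velocity V relative to K."""
    beta = V / c
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    Ex, Ey, Ez = E_prime
    Hx, Hy, Hz = H_prime
    E = (Ex, gamma * (Ey + beta * Hz), gamma * (Ez - beta * Hy))   # (24.2)
    H = (Hx, gamma * (Hy - beta * Ez), gamma * (Hz + beta * Ey))   # (24.3)
    return E, H


# Pure electric field in K': in K the magnetic field is (1/c) V x E, as in (24.5).
V = 0.6
E, H = transform_fields((0.3, -1.2, 0.7), (0.0, 0.0, 0.0), V)
V_cross_E = (0.0, -V * E[2], V * E[1])
print(H, V_cross_E)

# Perpendicular fields with E < H in K: the frame moving with V = cE/H sees no electric field.
E_K, H_K = (0.0, 0.5, 0.0), (0.0, 0.0, 1.0)
E_new, H_new = transform_fields(E_K, H_K, -0.5)  # inverse transformation: V -> -V
print(E_new, H_new)
```

In the second example the remaining magnetic field has magnitude $\sqrt{H^2 - E^2}$.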

§ 25. Invariants of the field 

From the electric and magnetic field intensities we can form invariant quantities, which 
remain unchanged in the transition from one inertial reference system to another. 

The form of these invariants is easily found starting from the four-dimensional representation
of the field using the antisymmetric four-tensor $F^{ik}$. It is obvious that we can form the
following invariant quantities from the components of this tensor:

$$F_{ik}F^{ik} = \text{inv}, \tag{25.1}$$

$$e^{iklm}F_{ik}F_{lm} = \text{inv}, \tag{25.2}$$

where $e^{iklm}$ is the completely antisymmetric unit tensor of the fourth rank (cf. § 6). The first
quantity is a scalar, while the second is a pseudoscalar (the product of the tensor $F^{ik}$ with its
dual tensor).†

Expressing $F^{ik}$ in terms of the components of E and H using (23.5), it is easily shown that,
in three-dimensional form, these invariants have the form:

$$H^2 - E^2 = \text{inv}, \tag{25.3}$$

$$\mathbf{E}\cdot\mathbf{H} = \text{inv}. \tag{25.4}$$

The pseudoscalar character of the second of these is here apparent from the fact that it is the
product of the polar vector E with the axial vector H (whereas its square $(\mathbf{E}\cdot\mathbf{H})^2$ is a true
scalar).

From the invariance of the two expressions presented, we get the following theorems. If 
the electric and magnetic fields are mutually perpendicular in any reference system, that is, 
E • H = 0, then they are also perpendicular in every other inertial reference system. If the 
absolute values of E and H are equal to each other in any reference system, then they are the 
same in any other system. 

The following inequalities are also clearly valid. If in any reference system E > H (or H 
> E), then in every other system we will have E > H (or H > E). If in any system of reference 
the vectors E and H make an acute (or obtuse) angle, then they will make an acute (or 
obtuse) angle in every other reference system. 

By means of a Lorentz transformation we can always give E and H any arbitrary values,
subject only to the condition that $E^2 - H^2$ and $\mathbf{E}\cdot\mathbf{H}$ have fixed values. In particular, we can
always find an inertial system in which the electric and magnetic fields are parallel to each
other at a given point. In this system $\mathbf{E}\cdot\mathbf{H} = EH$, and from the two equations

$$E^2 - H^2 = E_0^2 - H_0^2, \qquad EH = \mathbf{E}_0\cdot\mathbf{H}_0,$$

we can find the values of E and H in this system of reference ($\mathbf{E}_0$ and $\mathbf{H}_0$ are the electric and
magnetic fields in the original system of reference).

The case where both invariants are zero is excluded. In this case, E and H are equal in magnitude and
mutually perpendicular in all reference systems.

If $\mathbf{E} \cdot \mathbf{H} = 0$, then we can always find a reference system in which E = 0 or H = 0
(according as $E^2 - H^2 < 0$ or $> 0$), that is, the field is purely magnetic or purely electric.
Conversely, if in any reference system E = 0 or H = 0, then they are mutually perpendicular
in every other system, in accordance with the statement at the end of the preceding section.

We shall give still another approach to the problem of finding the invariants of an 
antisymmetric four-tensor. From this method we shall, in particular, see that (25.3)-(25.4) 
are actually the only two independent invariants and at the same time we will explain some 
instructive mathematical properties of the Lorentz transformations when applied to such a 
four-tensor. 

Let us consider the complex vector 

F = E + iH. (25.5) 

† We also note that the pseudoscalar (25.2) can be expressed as a four-divergence:

$e^{iklm} F_{ik} F_{lm} = 4 \frac{\partial}{\partial x^i} \left( e^{iklm} A_k \frac{\partial A_m}{\partial x^l} \right)$,

as can be easily verified by using the antisymmetry of $e^{iklm}$.

Using formulas (24.2)-(24.3), it is easy to see that a Lorentz transformation (along the x
axis) for this vector has the form

$F_x = F'_x$, $F_y = F'_y \cosh\phi - i F'_z \sinh\phi = F'_y \cos i\phi - F'_z \sin i\phi$,

$F_z = F'_z \cos i\phi + F'_y \sin i\phi$, $\tanh\phi = V/c$. (25.6)

We see that a rotation in the x, t plane in four-space (which is what this Lorentz transformation 
is) for the vector F is equivalent to a rotation in the y, z plane through an imaginary angle 
in three-dimensional space. The set of all possible rotations in four-space (including also the
simple rotations around the x, y, and z axes) is equivalent to the set of all possible rotations
through complex angles in three-dimensional space (where the six angles of rotation in four-space
correspond to the three complex angles of rotation of the three-dimensional system).

The only invariant of a vector with respect to rotation is its square: $\mathbf{F}^2 = E^2 - H^2 +
2i\,\mathbf{E} \cdot \mathbf{H}$; thus the real quantities $E^2 - H^2$ and $\mathbf{E} \cdot \mathbf{H}$ are the only two independent invariants
of the tensor $F_{ik}$.

If $\mathbf{F}^2 \neq 0$, the vector F can be written as $\mathbf{F} = a\mathbf{n}$, where n is a complex unit vector ($\mathbf{n}^2 =
1$). By a suitable complex rotation we can point n along one of the coordinate axes; it is clear
that then n becomes real and determines the directions of the two vectors E and H: $\mathbf{F} = (E
+ iH)\mathbf{n}$; in other words we get the result that E and H become parallel to one another.
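The equivalence between a boost and a rotation through an imaginary angle can be illustrated with complex arithmetic. The sketch below rotates $\mathbf{F}' = \mathbf{E}' + i\mathbf{H}'$ in the y, z plane through the angle $i\phi$ as in (25.6) and compares the result with the direct field transformation from K' to K (the standard formulas, assumed to be what (24.2) and (24.3) state). All names and sample values are hypothetical.

```python
import cmath
import math


def complex_field(E, H):
    """The complex vector F = E + iH of (25.5)."""
    return tuple(complex(e, h) for e, h in zip(E, H))


def square(F):
    """F . F, formed without complex conjugation."""
    return sum(f * f for f in F)


def rotate_yz(F_primed, angle):
    """Rotation in the y, z plane through a (possibly complex) angle, as in (25.6)."""
    Fx, Fy, Fz = F_primed
    c, s = cmath.cos(angle), cmath.sin(angle)
    return (Fx, Fy * c - Fz * s, Fz * c + Fy * s)


def fields_in_K(E_primed, H_primed, beta):
    """Fields in K from those in K', which moves with velocity beta * c along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    Ex, Ey, Ez = E_primed
    Hx, Hy, Hz = H_primed
    E = (Ex, gamma * (Ey + beta * Hz), gamma * (Ez - beta * Hy))
    H = (Hx, gamma * (Hy - beta * Ez), gamma * (Hz + beta * Ey))
    return E, H


E_primed = (1.0, 2.0, 3.0)
H_primed = (-2.0, 0.5, 4.0)
beta = 0.6
phi = math.atanh(beta)  # tanh(phi) = V/c

F_rotated = rotate_yz(complex_field(E_primed, H_primed), 1j * phi)
F_direct = complex_field(*fields_in_K(E_primed, H_primed, beta))
print(F_rotated)
print(F_direct)
print(square(F_rotated), square(complex_field(E_primed, H_primed)))
```

The real and imaginary parts of the printed square are $E^2 - H^2$ and $2\,\mathbf{E} \cdot \mathbf{H}$, unchanged by the rotation.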


PROBLEM 

Determine the velocity of the system of reference in which the electric and magnetic fields are parallel. 

Solution: Systems of reference K' satisfying the required condition exist in infinite numbers. If we have
found one such, then the same property will be had by any other system moving relative to the first with
its velocity directed along the common direction of E and H. Therefore it is sufficient to find one of these
systems which has a velocity perpendicular to both fields. Choosing the direction of the velocity as the x
axis, and making use of the fact that in K', $E'_x = H'_x = 0$, $E'_y H'_z - E'_z H'_y = 0$, we obtain with the aid of
formulas (24.2) and (24.3) for the velocity V of the K' system relative to the original system the following
equation:

$\frac{\mathbf{V}/c}{1 + V^2/c^2} = \frac{\mathbf{E} \times \mathbf{H}}{E^2 + H^2}$

(we must choose that root of the quadratic equation for which V < c).
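A minimal numerical sketch of this solution follows. It solves the quadratic equation for $\beta = V/c$, takes the root with $\beta < 1$, and then checks that the transformed fields have a vanishing vector product. The transformation used, $\mathbf{E}' = \gamma(\mathbf{E} + \boldsymbol{\beta} \times \mathbf{H})$, $\mathbf{H}' = \gamma(\mathbf{H} - \boldsymbol{\beta} \times \mathbf{E})$, is the standard one for a velocity perpendicular to both fields and is an assumption as far as this section is concerned; the function names are hypothetical.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def parallel_frame_velocity(E, H):
    """V/c (as a vector along E x H) of a frame in which E and H are parallel."""
    S = cross(E, H)
    s = math.sqrt(dot(S, S))
    if s == 0.0:
        return (0.0, 0.0, 0.0)  # fields already parallel, or one of them vanishes
    kappa = s / (dot(E, E) + dot(H, H))  # magnitude of the right-hand side, at most 1/2
    discriminant = max(0.0, 1.0 - 4.0 * kappa * kappa)
    beta = (1.0 - math.sqrt(discriminant)) / (2.0 * kappa)  # the root with beta < 1
    if beta >= 1.0:
        raise ValueError("both invariants vanish: no such frame exists")
    return tuple(beta * component / s for component in S)


def boost_perpendicular(E, H, beta_vec):
    """Field transformation valid only when beta_vec is perpendicular to E and H."""
    gamma = 1.0 / math.sqrt(1.0 - dot(beta_vec, beta_vec))
    bxH = cross(beta_vec, H)
    bxE = cross(beta_vec, E)
    E_new = tuple(gamma * (e + x) for e, x in zip(E, bxH))
    H_new = tuple(gamma * (h - x) for h, x in zip(H, bxE))
    return E_new, H_new


E = (1.0, 2.0, -0.5)
H = (0.3, -1.0, 2.0)
beta_vec = parallel_frame_velocity(E, H)
E_new, H_new = boost_perpendicular(E, H, beta_vec)
print(beta_vec)
print(cross(E_new, H_new))  # components close to zero: the fields are parallel
```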
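§ 26. The first pair of Maxwell’s equations

From the expressions

$\mathbf{H} = \operatorname{curl} \mathbf{A}$, $\mathbf{E} = -\frac{1}{c} \frac{\partial \mathbf{A}}{\partial t} - \operatorname{grad} \phi$

it is easy to obtain equations containing only E and H. To do this we find curl E:

$\operatorname{curl} \mathbf{E} = -\frac{1}{c} \frac{\partial}{\partial t} \operatorname{curl} \mathbf{A} - \operatorname{curl} \operatorname{grad} \phi$.

But the curl of any gradient is zero. Consequently,

$\operatorname{curl} \mathbf{E} = -\frac{1}{c} \frac{\partial \mathbf{H}}{\partial t}$. (26.1)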

Taking the divergence of both sides of the equation curl A = H, and recalling that div curl 
= 0, we find 

div H = 0. (26.2) 

The equations (26.1) and (26.2) are called the first pair of Maxwell’s equations.† We note
that these two equations still do not completely determine the properties of the fields. This
is clear from the fact that they determine the change of the magnetic field with time (the
derivative $\partial \mathbf{H}/\partial t$), but do not determine the derivative $\partial \mathbf{E}/\partial t$.
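Because (26.1) and (26.2) follow identically from the definitions of E and H in terms of the potentials, they hold for any smooth choice of $\phi$ and A. The sketch below illustrates this with central differences applied to arbitrarily chosen potentials; the potentials, the step size and the unit choice c = 1 are assumptions made only for the illustration.

```python
import math

C = 1.0      # speed of light in illustrative units (assumption)
STEP = 1e-3  # finite-difference step (assumption)


def phi(t, x, y, z):
    """Arbitrary smooth scalar potential, chosen only for illustration."""
    return math.sin(x + 2.0 * y) * math.cos(t) + z * z


def vector_potential(t, x, y, z):
    """Arbitrary smooth vector potential, chosen only for illustration."""
    return (math.sin(y - t) * z, math.cos(x * z + t), x * y * math.exp(-t))


def component(F, i):
    return lambda *p: F(*p)[i]


def d(f, axis):
    """Central-difference derivative of a scalar f along axis 0..3 = t, x, y, z."""
    def derivative(*p):
        hi, lo = list(p), list(p)
        hi[axis] += STEP
        lo[axis] -= STEP
        return (f(*hi) - f(*lo)) / (2.0 * STEP)
    return derivative


def curl(F):
    """Curl of a triple of scalar functions F = [Fx, Fy, Fz]."""
    Fx, Fy, Fz = F
    return [
        lambda *p: d(Fz, 2)(*p) - d(Fy, 3)(*p),
        lambda *p: d(Fx, 3)(*p) - d(Fz, 1)(*p),
        lambda *p: d(Fy, 1)(*p) - d(Fx, 2)(*p),
    ]


def divergence(F):
    return lambda *p: sum(d(F[i], i + 1)(*p) for i in range(3))


A = [component(vector_potential, i) for i in range(3)]


def electric_component(i):
    """E_i = -(1/c) dA_i/dt - (grad phi)_i."""
    return lambda *p: -d(A[i], 0)(*p) / C - d(phi, i + 1)(*p)


E = [electric_component(i) for i in range(3)]
H = curl(A)

point = (0.3, 0.1, -0.4, 0.7)  # (t, x, y, z)
curl_E = curl(E)
residual_26_1 = [curl_E[i](*point) + d(H[i], 0)(*point) / C for i in range(3)]
residual_26_2 = divergence(H)(*point)
print(residual_26_1, residual_26_2)
```

The residuals vanish up to rounding error, since nested central differences commute just as the exact derivatives do.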

Equations (26.1) and (26.2) can be written in integral form. According to Gauss’ theorem

$\int \operatorname{div} \mathbf{H} \, dV = \oint \mathbf{H} \cdot d\mathbf{f}$,

where the integral on the right goes over the entire closed surface surrounding the volume
over which the integral on the left is extended. On the basis of (26.2), we have

$\oint \mathbf{H} \cdot d\mathbf{f} = 0$. (26.3)


† Maxwell’s equations (the fundamental equations of electrodynamics) were first formulated by him in
the 1860’s.

The integral of a vector over a surface is called the flux of the vector through the surface.
Thus the flux of the magnetic field through every closed surface is zero.

According to Stokes’ theorem,

$\int \operatorname{curl} \mathbf{E} \cdot d\mathbf{f} = \oint \mathbf{E} \cdot d\mathbf{l}$,

where the integral on the right is taken over the closed contour bounding the surface over
which the left side is integrated. From (26.1) we find, integrating both sides for any surface,

$\oint \mathbf{E} \cdot d\mathbf{l} = -\frac{1}{c} \frac{\partial}{\partial t} \int \mathbf{H} \cdot d\mathbf{f}$. (26.4)

The integral of a vector over a closed contour is called the circulation of the vector around 
the contour. The circulation of the electric field is also called the electromotive force in the 
given contour. Thus the electromotive force in any contour is equal to minus the time 
derivative of the magnetic flux through a surface bounded by this contour. 

The Maxwell equations (26.1) and (26.2) can be expressed in four-dimensional notation. 
Using the definition of the electromagnetic field tensor 
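$F_{ik} = \partial A_k/\partial x^i - \partial A_i/\partial x^k$,

it is easy to verify that

$\frac{\partial F_{ik}}{\partial x^l} + \frac{\partial F_{kl}}{\partial x^i} + \frac{\partial F_{li}}{\partial x^k} = 0$. (26.5)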


The expression on the left is a tensor of third rank, which is antisymmetric in all three
indices. The only components which are not identically zero are those with $i \neq k \neq l$. Thus
there are altogether four different equations which we can easily show [by substituting from
(23.5)] coincide with equations (26.1) and (26.2).

We can construct the four-vector which is dual to this antisymmetric four-tensor of rank
three by multiplying the tensor by $e^{iklm}$ and contracting on three pairs of indices (see § 6).
Thus (26.5) can be written in the form

$e^{iklm} \frac{\partial F_{lm}}{\partial x^k} = 0$, (26.6)

which shows explicitly that there are only four independent equations. 


§ 27. The action function of the electromagnetic field 

The action function S for the whole system, consisting of an electromagnetic field as well
as the particles located in it, must consist of three parts:

$S = S_f + S_m + S_{mf}$, (27.1)

where $S_m$ is that part of the action which depends only on the properties of the particles, that
is, just the action for free particles. For a single free particle, it is given by (8.1). If there are
several particles, then their total action is the sum of the actions for each of the individual
particles. Thus,

$S_m = -\sum mc \int ds$. (27.2)

The quantity $S_{mf}$ is that part of the action which depends on the interaction between the
particles and the field. According to § 16, we have for a system of particles:

$S_{mf} = -\sum \frac{e}{c} \int A_k \, dx^k$. (27.3)

In each term of this sum, $A_k$ is the potential of the field at that point of spacetime at which
the corresponding particle is located. The sum $S_m + S_{mf}$ is already familiar to us as the action
(16.1) for charges in a field.

Finally $S_f$ is that part of the action which depends only on the properties of the field itself,
that is, $S_f$ is the action for a field in the absence of charges. Up to now, because we were
interested only in the motion of charges in a given electromagnetic field, the quantity $S_f$,
which does not depend on the particles, did not concern us, since this term cannot affect the
motion of the particles. Nevertheless this term is necessary when we want to find equations
determining the field itself. This corresponds to the fact that from the parts $S_m + S_{mf}$ of the
action we found only two equations for the field, (26.1) and (26.2), which are not yet
sufficient for complete determination of the field.

To establish the form of the action $S_f$ for the field, we start from the following very
important property of electromagnetic fields. As experiment shows, the electromagnetic 
field satisfies the so-called principle of superposition. This principle consists in the statement 
that the field produced by a system of charges is the result of a simple composition of the 
fields produced by each of the particles individually. This means that the resultant field 
intensity at each point is equal to the vector sum of the individual field intensities at that 
point. 

Every solution of the field equations gives a field that can exist in nature. According to the 
principle of superposition, the sum of any such fields must be a field that can exist in nature, 
that is, must satisfy the field equations. 

As is well known, linear differential equations have just this property, that the sum of any 
solutions is also a solution. Consequently the field equations must be linear differential 
equations. 

From the discussion, it follows that under the integral sign for the action $S_f$ there must
stand an expression quadratic in the field. Only in this case will the field equations be linear; 
the field equations are obtained by varying the action, and in the variation the degree of the 
expression under the integral sign decreases by unity. 

The potentials cannot enter into the expression for the action $S_f$, since they are not uniquely
determined (in $S_{mf}$ this lack of uniqueness was not important). Therefore $S_f$ must be the
integral of some function of the electromagnetic field tensor $F_{ik}$. But the action must be a
scalar and must therefore be the integral of some scalar. The only such quantity is the
product $F_{ik} F^{ik}$.†

† The function in the integrand of $S_f$ must not include derivatives of $F_{ik}$, since the Lagrangian can contain,
aside from the coordinates, only their first time derivatives. The role of “coordinates” (i.e., parameters to
be varied in the principle of least action) is in this case played by the field potential $A_k$; this is analogous
to the situation in mechanics where the Lagrangian of a mechanical system contains only the coordinates
of the particles and their first time derivatives.

As for the quantity $e^{iklm} F_{ik} F_{lm}$ (§ 25), as pointed out in the footnote on p. 68, it is a complete
four-divergence, so that adding it to the integrand in $S_f$ would have no effect on the “equations of motion”. It is
interesting that this quantity is already excluded from the action for a reason independent of the fact that it
is a pseudoscalar and not a true scalar.

Thus $S_f$ must have the form:

$S_f = a \iint F_{ik} F^{ik} \, dV \, dt$, $dV = dx \, dy \, dz$,

where the integral extends over all of space and the time between two given moments; a is
some constant. Under the integral stands $F_{ik} F^{ik} = 2(H^2 - E^2)$. The field E contains the
derivative $\partial \mathbf{A}/\partial t$, but it is easy to see that $(\partial \mathbf{A}/\partial t)^2$ must appear in the action with the positive
sign (and therefore $E^2$ must have a positive sign). For if $(\partial \mathbf{A}/\partial t)^2$ appeared in $S_f$ with a minus
sign, then sufficiently rapid change of the potential with time (in the time interval under
consideration) could always make $S_f$ a negative quantity with arbitrarily large absolute
value. Consequently $S_f$ could not have a minimum, as is required by the principle of least
action. Thus, a must be negative.

The numerical value of a depends on the choice of units for measurement of the field. We 
note that after the choice of a definite value for a and for the units of measurement of field, 
the units for measurement of all other electromagnetic quantities are determined. 

From now on we shall use the Gaussian system of units; in this system a is a dimensionless
quantity, equal to $-1/(16\pi)$.†

Thus the action for the field has the form

$S_f = -\frac{1}{16\pi c} \int F_{ik} F^{ik} \, d\Omega$, $d\Omega = c \, dt \, dx \, dy \, dz$. (27.4)

In three-dimensional form:

$S_f = \frac{1}{8\pi} \iint (E^2 - H^2) \, dV \, dt$. (27.5)

In other words, the Lagrangian for the field is

$L_f = \frac{1}{8\pi} \int (E^2 - H^2) \, dV$. (27.6)

The action for field plus particles has the form

$S = -\sum \int mc \, ds - \sum \int \frac{e}{c} A_k \, dx^k - \frac{1}{16\pi c} \int F_{ik} F^{ik} \, d\Omega$. (27.7)

We emphasize that now the charges are not assumed to be small, as in the derivation of the
equation of motion of a charge in a given field. Therefore $A_k$ and $F_{ik}$ refer to the actual field,
that is, the external field plus the field produced by the particles themselves; $A_k$ and $F_{ik}$ now
depend on the positions and velocities of the charges.

§ 28. The four-dimensional current vector 

Instead of treating charges as points, for mathematical convenience we frequently consider
them to be distributed continuously in space. Then we can introduce the “charge density” $\varrho$
such that $\varrho \, dV$ is the charge contained in the volume dV. The density $\varrho$ is in general a
function of the coordinates and the time. The integral of $\varrho$ over a certain volume is the
charge contained in that volume.

† In addition to the Gaussian system, one also uses the Heaviside system, in which $a = -1/4$. In this
system of units the field equations have a more convenient form ($4\pi$ does not appear) but on the other hand,
$\pi$ appears in the Coulomb law. Conversely, in the Gaussian system the field equations contain $4\pi$, but the
Coulomb law has a simple form.

Here we must remember that charges are actually pointlike, so that the density $\varrho$ is zero
everywhere except at points where the point charges are located, and the integral $\int \varrho \, dV$ must
be equal to the sum of the charges contained in the given volume. Therefore $\varrho$ can be
expressed with the help of the δ-function in the following form†:

$\varrho = \sum_a e_a \, \delta(\mathbf{r} - \mathbf{r}_a)$, (28.1)

where the sum goes over all the charges and $\mathbf{r}_a$ is the radius vector of the charge $e_a$.
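The two δ-function properties used most often, $\int f(x)\,\delta(x - a)\,dx = f(a)$ and $\delta(\alpha x) = \delta(x)/|\alpha|$, can be illustrated numerically by replacing the δ-function with a narrow normalized Gaussian. This replacement, the width, the test function and the integration range are all assumptions of the sketch, and the results agree with the exact values only up to terms of the order of the squared width.

```python
import math


def nascent_delta(x, width=1e-2):
    """Narrow normalized Gaussian standing in for the delta-function (assumption)."""
    return math.exp(-0.5 * (x / width) ** 2) / (width * math.sqrt(2.0 * math.pi))


def integrate(g, lower=-2.0, upper=2.0, steps=40000):
    """Midpoint rule; the range only needs to contain the peak."""
    h = (upper - lower) / steps
    return h * sum(g(lower + (n + 0.5) * h) for n in range(steps))


def f(x):
    return math.cos(x) + x  # arbitrary continuous test function


a = 0.3
alpha = -2.5
sifted = integrate(lambda x: f(x) * nascent_delta(x - a))
scaled = integrate(lambda x: f(x) * nascent_delta(alpha * x))
print(sifted, f(a))
print(scaled, f(0.0) / abs(alpha))
```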

The charge on a particle is, from its very definition, an invariant quantity, that is, it does
not depend on the choice of reference system. On the other hand, the density $\varrho$ is not
generally an invariant; only the product $\varrho \, dV$ is invariant.

Multiplying the equality $de = \varrho \, dV$ on both sides by $dx^i$:

$de \, dx^i = \varrho \, dV \, dx^i = \varrho \, dV \, dt \, \frac{dx^i}{dt}$.

† The δ-function δ(x) is defined as follows: δ(x) = 0 for all nonzero values of x; for x = 0, δ(0) = ∞, in
such a way that the integral

$\int_{-\infty}^{+\infty} \delta(x) \, dx = 1$.

From this definition there result the following properties: if f(x) is any continuous function, then

$\int_{-\infty}^{+\infty} f(x) \, \delta(x - a) \, dx = f(a)$, (II)

and in particular,

$\int_{-\infty}^{+\infty} f(x) \, \delta(x) \, dx = f(0)$. (III)

(The limits of integration, it is understood, need not be ±∞; the range of integration can be arbitrary,
provided it includes the point at which the δ-function does not vanish.)

The meaning of the following equalities is that the left and right sides give the same result when
introduced as factors under an integral sign:

$\delta(-x) = \delta(x)$, $\delta(ax) = \frac{1}{|a|} \delta(x)$. (IV)

The last equality is a special case of the more general relation

$\delta[\phi(x)] = \sum_i \frac{1}{|\phi'(a_i)|} \delta(x - a_i)$,

where $\phi(x)$ is a single-valued function (whose inverse need not be single-valued) and the $a_i$ are the roots of
the equation $\phi(x) = 0$.

Just as δ(x) was defined for one variable x, we can introduce a three-dimensional δ-function, δ(r), equal
to zero everywhere except at the origin of the three-dimensional coordinate system, and whose integral
over all space is unity. As such a function we can clearly use the product δ(x) δ(y) δ(z).

On the left stands a four-vector (since de is a scalar and $dx^i$ is a four-vector). This means that
the right side must be a four-vector. But dV dt is a scalar, and so $\varrho \, (dx^i/dt)$ is a four-vector.
This vector (we denote it by $j^i$) is called the current four-vector:

$j^i = \varrho \frac{dx^i}{dt}$. (28.2)

The space components of this vector form the current density vector,

$\mathbf{j} = \varrho \mathbf{v}$, (28.3)

where v is the velocity of the charge at the given point. The time component of the four-vector
(28.2) is $c\varrho$. Thus

$j^i = (c\varrho, \mathbf{j})$. (28.4)
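A small sketch of (28.4) for charge moving uniformly: since $\varrho \, dV$ is invariant while the volume element is Lorentz-contracted, the density in the frame where the charge moves is taken to be $\gamma$ times the rest-frame density. That relation is an assumption brought in for the illustration and is not derived in this section. With it, the square $j_i j^i = c^2\varrho^2 - \mathbf{j}^2$ comes out the same for every velocity. The names and the unit choice c = 1 are hypothetical.

```python
import math

C = 1.0  # speed of light in illustrative units (assumption)


def four_current(rho_rest, velocity):
    """j^i = (c rho, rho v) for charge of rest-frame density rho_rest moving with velocity v."""
    v2 = sum(v * v for v in velocity)
    gamma = 1.0 / math.sqrt(1.0 - v2 / C ** 2)
    rho = gamma * rho_rest  # rho dV is invariant while dV contracts by 1/gamma
    return (C * rho,) + tuple(rho * v for v in velocity)


def invariant_square(j):
    """j_i j^i with signature (+, -, -, -)."""
    return j[0] ** 2 - sum(x * x for x in j[1:])


for velocity in ((0.0, 0.0, 0.0), (0.3, -0.4, 0.5), (0.0, 0.9, 0.0)):
    print(velocity, invariant_square(four_current(2.0, velocity)))
```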

The total charge present in all of space is equal to the integral $\int \varrho \, dV$ over all space. We can
write this integral in four-dimensional form:

$\int \varrho \, dV = \frac{1}{c} \int j^0 \, dV = \frac{1}{c} \int j^i \, dS_i$, (28.5)

where the integral is taken over the entire four-dimensional hyperplane perpendicular to the
$x^0$ axis (clearly this integration means integration over the whole three-dimensional space).
Generally, the integral

$\frac{1}{c} \int j^i \, dS_i$

over an arbitrary hypersurface is the sum of the charges whose world lines pass through this
surface.

Let us introduce the current four-vector into the expression (27.7) for the action and
transform the second term in that expression. Introducing in place of the point charges e a
continuous distribution of charge with density $\varrho$, we must write this term as

$-\frac{1}{c} \int \varrho A_i \, dx^i \, dV$,

replacing the sum over the charges by an integral over the whole volume. Rewriting in the
form

$-\frac{1}{c} \int \varrho \frac{dx^i}{dt} A_i \, dV \, dt$,

we see that this term is equal to

$-\frac{1}{c^2} \int A_i j^i \, d\Omega$.

Thus the action S takes the form

$S = -\sum \int mc \, ds - \frac{1}{c^2} \int A_i j^i \, d\Omega - \frac{1}{16\pi c} \int F_{ik} F^{ik} \, d\Omega$. (28.6)

§ 29. The equation of continuity

The change with time of the charge contained in a certain volume is determined by the
derivative

$\frac{\partial}{\partial t} \int \varrho \, dV$.

On the other hand, the change in unit time, say, is determined by the quantity of charge
which in unit time leaves the volume and goes to the outside or, conversely, passes to its
interior. The quantity of charge which passes in unit time through the element df of the
surface bounding our volume is equal to $\varrho \mathbf{v} \cdot d\mathbf{f}$, where v is the velocity of the charge at the
point in space where the element df is located. The vector df is directed, as always, along the
external normal to the surface, that is, along the normal toward the outside of the volume
under consideration. Therefore $\varrho \mathbf{v} \cdot d\mathbf{f}$ is positive if charge leaves the volume, and negative
if charge enters the volume. The total amount of charge leaving the given volume per unit
time is consequently $\oint \varrho \mathbf{v} \cdot d\mathbf{f}$, where the integral extends over the whole of the closed
surface bounding the volume.

From the equality of these two expressions, we get

$\frac{\partial}{\partial t} \int \varrho \, dV = -\oint \varrho \mathbf{v} \cdot d\mathbf{f}$. (29.1)

The minus sign appears on the right, since the left side is positive if the total charge in the
given volume increases. The equation (29.1) is the so-called equation of continuity, expressing
the conservation of charge in integral form. Noting that $\varrho \mathbf{v}$ is the current density, we can
rewrite (29.1) in the form

$\frac{\partial}{\partial t} \int \varrho \, dV = -\oint \mathbf{j} \cdot d\mathbf{f}$. (29.2)

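We also write this equation in differential form. To do this we apply Gauss’ theorem to
(29.2):

$\oint \mathbf{j} \cdot d\mathbf{f} = \int \operatorname{div} \mathbf{j} \, dV$,

and we find

$\int \left( \operatorname{div} \mathbf{j} + \frac{\partial \varrho}{\partial t} \right) dV = 0$.

Since this must hold for integration over an arbitrary volume, the integrand must be zero:

$\operatorname{div} \mathbf{j} + \frac{\partial \varrho}{\partial t} = 0$. (29.3)

This is the equation of continuity in differential form.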

It is easy to check that the expression (28.1) for $\varrho$ in δ-function form automatically
satisfies the equation (29.3). For simplicity we assume that we have altogether only one
charge, so that

$\varrho = e \, \delta(\mathbf{r} - \mathbf{r}_0)$.

The current j is then

$\mathbf{j} = e \mathbf{v} \, \delta(\mathbf{r} - \mathbf{r}_0)$,

where v is the velocity of the charge. We determine the derivative $\partial \varrho/\partial t$. During the motion
of the charge its coordinates change, that is, the vector $\mathbf{r}_0$ changes. Therefore

$\frac{\partial \varrho}{\partial t} = \frac{\partial \varrho}{\partial \mathbf{r}_0} \cdot \frac{\partial \mathbf{r}_0}{\partial t}$.

But $\partial \mathbf{r}_0/\partial t$ is just the velocity v of the charge. Furthermore, since $\varrho$ is a function of $\mathbf{r} - \mathbf{r}_0$,

$\frac{\partial \varrho}{\partial \mathbf{r}_0} = -\frac{\partial \varrho}{\partial \mathbf{r}}$.

Consequently

$\frac{\partial \varrho}{\partial t} = -\mathbf{v} \cdot \operatorname{grad} \varrho = -\operatorname{div}(\varrho \mathbf{v})$

(the velocity v of the charge of course does not depend on r). Thus we arrive at the equation
(29.3).
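The same argument can be followed numerically if the δ-function is smeared into a narrow Gaussian that moves rigidly with the charge. The smearing, the trajectory and the step size below are assumptions of the sketch; the residual $\partial \varrho/\partial t + \operatorname{div} \mathbf{j}$ is evaluated by central differences and vanishes up to discretization error.

```python
import math

CHARGE = 1.0
WIDTH = 0.5  # width of the Gaussian standing in for the delta-function (assumption)
STEP = 1e-4  # finite-difference step


def trajectory(t):
    """Position r_0(t) of the charge; an arbitrary smooth path."""
    return (math.cos(t), math.sin(t), 0.5 * t)


def velocity(t):
    """v = dr_0/dt for the path above; it does not depend on r."""
    return (-math.sin(t), math.cos(t), 0.5)


def density(t, r):
    """rho = e g(r - r_0(t)) with a normalized Gaussian g."""
    r0 = trajectory(t)
    s2 = sum((a - b) ** 2 for a, b in zip(r, r0))
    norm = (2.0 * math.pi * WIDTH ** 2) ** -1.5
    return CHARGE * norm * math.exp(-s2 / (2.0 * WIDTH ** 2))


def current(t, r):
    """j = rho v."""
    rho = density(t, r)
    return tuple(rho * v for v in velocity(t))


def continuity_residual(t, r):
    """d(rho)/dt + div j, which (29.3) requires to vanish."""
    drho_dt = (density(t + STEP, r) - density(t - STEP, r)) / (2.0 * STEP)
    div_j = 0.0
    for axis in range(3):
        hi, lo = list(r), list(r)
        hi[axis] += STEP
        lo[axis] -= STEP
        div_j += (current(t, hi)[axis] - current(t, lo)[axis]) / (2.0 * STEP)
    return drho_dt + div_j


print(continuity_residual(0.7, (0.9, 0.4, 0.2)))
```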

It is easily verified that, in four-dimensional form, the continuity equation (29.3) is expressed
by the statement that the four-divergence of the current four-vector is zero:

$\frac{\partial j^i}{\partial x^i} = 0$. (29.4)

In the preceding section we saw that the total charge present in all of space can be written
as

$\frac{1}{c} \int j^i \, dS_i$,

where the integration is extended over the hyperplane $x^0$ = const. At each moment of time,
the total charge is given by such an integral taken over a different hyperplane perpendicular
to the $x^0$ axis. It is easy to verify that the equation (29.4) actually leads to conservation of
charge, that is, to the result that the integral $\int j^i \, dS_i$ is the same no matter what hyperplane $x^0$
= const we integrate over. The difference between the integrals $\int j^i \, dS_i$ taken over two such
hyperplanes can be written in the form $\oint j^i \, dS_i$, where the integral is taken over the whole
closed hypersurface surrounding the four-volume between the two hyperplanes under
consideration (this integral differs from the required integral because of the presence of the
integral over the infinitely distant “sides” of the hypersurface which, however, drop out,
since there are no charges at infinity). Using Gauss’ theorem (6.15) we can transform this to
an integral over the four-volume between the two hyperplanes and verify that

$\oint j^i \, dS_i = \int \frac{\partial j^i}{\partial x^i} \, d\Omega = 0$. (29.5)

The proof presented clearly remains valid also for any two integrals $\int j^i \, dS_i$, in which the
integration is extended over any two infinite hypersurfaces (and not just the hyperplanes $x^0$
= const) which each contain all of three-dimensional space. From this it follows that the
integral

$\frac{1}{c} \int j^i \, dS_i$

is actually identical in value (and equal to the total charge in space) no matter over what such
hypersurface the integration is taken.

We have already mentioned (see the footnote on p. 53) the close connection between the
gauge invariance of the equations of electrodynamics and the law of conservation of charge.
Let us show this once again using the expression for the action in the form (28.6). On
replacing $A_i$ by $A_i - (\partial f/\partial x^i)$, the integral

$\frac{1}{c^2} \int j^i \frac{\partial f}{\partial x^i} \, d\Omega$

is added to the second term in this expression. It is precisely the conservation of charge, as
expressed in the continuity equation (29.4), that enables us to write the integrand as a
four-divergence $\partial (f j^i)/\partial x^i$, after which, using Gauss’ theorem, the integral over the four-volume
is transformed into an integral over the bounding hypersurface; on varying the action, these
integrals drop out and thus have no effect on the equations of motion.


§ 30. The second pair of Maxwell equations 

In finding the field equations with the aid of the principle of least action we must assume 
the motion of the charges to be given and vary only the potentials (which serve as the 
“coordinates” of the system); on the other hand, to find the equations of motion we assumed 
the field to be given and varied the trajectory of the particle. 

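Therefore the variation of the first term in (28.6) is zero, and in the second we must not
vary the current $j^i$. Thus,

$\delta S = -\frac{1}{c} \int \left\{ \frac{1}{c} j^i \, \delta A_i + \frac{1}{8\pi} F^{ik} \, \delta F_{ik} \right\} d\Omega = 0$

(where we have used the fact that $F^{ik} \, \delta F_{ik} = F_{ik} \, \delta F^{ik}$). Substituting $F_{ik} = \partial A_k/\partial x^i - \partial A_i/\partial x^k$, we
have

$\delta S = -\frac{1}{c} \int \left\{ \frac{1}{c} j^i \, \delta A_i + \frac{1}{8\pi} F^{ik} \frac{\partial}{\partial x^i} \delta A_k - \frac{1}{8\pi} F^{ik} \frac{\partial}{\partial x^k} \delta A_i \right\} d\Omega$.

In the second term we interchange the indices i and k, over which the expressions are
summed, and in addition replace $F^{ki}$ by $-F^{ik}$. Then we obtain

$\delta S = -\frac{1}{c} \int \left\{ \frac{1}{c} j^i \, \delta A_i - \frac{1}{4\pi} F^{ik} \frac{\partial}{\partial x^k} \delta A_i \right\} d\Omega$.

The second of these integrals we integrate by parts, that is, we apply Gauss’ theorem:

$\delta S = -\frac{1}{c} \int \left\{ \frac{1}{c} j^i + \frac{1}{4\pi} \frac{\partial F^{ik}}{\partial x^k} \right\} \delta A_i \, d\Omega + \frac{1}{4\pi c} \int F^{ik} \, \delta A_i \, dS_k$. (30.1)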

In the second term we must insert the values at the limits of integration. The limits for the
coordinates are at infinity, where the field is zero. At the limits of the time integration, that
is, at the given initial and final time values, the variation of the potentials is zero, since in
accord with the principle of least action the potentials are given at these times. Thus the
second term in (30.1) is zero, and we find

$\int \left( \frac{1}{c} j^i + \frac{1}{4\pi} \frac{\partial F^{ik}}{\partial x^k} \right) \delta A_i \, d\Omega = 0$.

Since according to the principle of least action, the variations $\delta A_i$ are arbitrary, the coefficients
of the $\delta A_i$ must be set equal to zero:

$\frac{\partial F^{ik}}{\partial x^k} = -\frac{4\pi}{c} j^i$. (30.2)


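Let us express these four (i = 0, 1, 2, 3) equations in three-dimensional form. For i = 1:

$\frac{\partial F^{11}}{\partial x} + \frac{\partial F^{12}}{\partial y} + \frac{\partial F^{13}}{\partial z} + \frac{1}{c} \frac{\partial F^{10}}{\partial t} = -\frac{4\pi}{c} j^1$.

Substituting the values for the components of $F^{ik}$, we find

$\frac{\partial H_z}{\partial y} - \frac{\partial H_y}{\partial z} - \frac{1}{c} \frac{\partial E_x}{\partial t} = \frac{4\pi}{c} j_x$.

This together with the two succeeding equations (i = 2, 3) can be written as one vector
equation:

$\operatorname{curl} \mathbf{H} = \frac{1}{c} \frac{\partial \mathbf{E}}{\partial t} + \frac{4\pi}{c} \mathbf{j}$. (30.3)

Finally, the fourth equation (i = 0) gives

$\operatorname{div} \mathbf{E} = 4\pi \varrho$. (30.4)

Equations (30.3) and (30.4) are the second pair of Maxwell equations.† Together with the
first pair of Maxwell equations they completely determine the electromagnetic field, and are
the fundamental equations of the theory of such fields, i.e. of electrodynamics.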
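As a consistency check, the sketch below verifies by central differences that a plane wave in vacuum ($\varrho = 0$, $\mathbf{j} = 0$) satisfies both pairs of Maxwell equations. The particular wave (E along y, H along z, propagation along x, $\omega = ck$), the amplitude and the unit choice c = 1 are assumptions made for the illustration.

```python
import math

C = 1.0        # speed of light in illustrative units (assumption)
K = 2.0        # wave number (assumption)
OMEGA = C * K  # vacuum dispersion relation
AMP = 1.5      # field amplitude (assumption)
STEP = 1e-4    # finite-difference step


def E_field(t, r):
    return (0.0, AMP * math.cos(K * r[0] - OMEGA * t), 0.0)


def H_field(t, r):
    return (0.0, 0.0, AMP * math.cos(K * r[0] - OMEGA * t))


def d_dt(F, t, r):
    return tuple((a - b) / (2 * STEP) for a, b in zip(F(t + STEP, r), F(t - STEP, r)))


def d_dx(F, t, r, axis):
    hi, lo = list(r), list(r)
    hi[axis] += STEP
    lo[axis] -= STEP
    return tuple((a - b) / (2 * STEP) for a, b in zip(F(t, hi), F(t, lo)))


def curl(F, t, r):
    dx, dy, dz = (d_dx(F, t, r, axis) for axis in range(3))
    return (dy[2] - dz[1], dz[0] - dx[2], dx[1] - dy[0])


def div(F, t, r):
    return sum(d_dx(F, t, r, axis)[axis] for axis in range(3))


def maxwell_residuals(t, r):
    """Left minus right side of (26.1), (26.2), (30.3), (30.4) with rho = 0, j = 0."""
    curl_E, curl_H = curl(E_field, t, r), curl(H_field, t, r)
    dE_dt, dH_dt = d_dt(E_field, t, r), d_dt(H_field, t, r)
    return {
        "26.1": max(abs(a + b / C) for a, b in zip(curl_E, dH_dt)),
        "26.2": abs(div(H_field, t, r)),
        "30.3": max(abs(a - b / C) for a, b in zip(curl_H, dE_dt)),
        "30.4": abs(div(E_field, t, r)),
    }


residuals = maxwell_residuals(0.4, (1.1, -0.3, 0.8))
print(residuals)
```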

Let us write these equations in integral form. Integrating (30.4) over a volume and applying
Gauss’ theorem

$\int \operatorname{div} \mathbf{E} \, dV = \oint \mathbf{E} \cdot d\mathbf{f}$,

we get

$\oint \mathbf{E} \cdot d\mathbf{f} = 4\pi \int \varrho \, dV$. (30.5)

Thus the flux of the electric field through a closed surface is equal to $4\pi$ times the total
charge contained in the volume bounded by the surface.
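Equation (30.5) can be illustrated by integrating the field of a point charge over a sphere numerically. The Coulomb field $\mathbf{E} = e(\mathbf{r} - \mathbf{r}_a)/|\mathbf{r} - \mathbf{r}_a|^3$ of a charge at rest (Gaussian units) is assumed here, as are the grid sizes and sample positions; the flux should be close to $4\pi e$ when the charge lies inside the sphere and close to zero when it lies outside.

```python
import math


def coulomb_field(charge, source, r):
    """Field e (r - r_a) / |r - r_a|^3 of a point charge at rest (Gaussian units)."""
    d = [a - b for a, b in zip(r, source)]
    dist = math.sqrt(sum(x * x for x in d))
    return [charge * x / dist ** 3 for x in d]


def flux_through_sphere(charge, source, radius=1.0, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the integral of E . df over a sphere centred at the origin."""
    dtheta = math.pi / n_theta
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * dtheta
        for b in range(n_phi):
            phi = (b + 0.5) * dphi
            # Outward unit normal at this point of the sphere
            n = (
                math.sin(theta) * math.cos(phi),
                math.sin(theta) * math.sin(phi),
                math.cos(theta),
            )
            r = [radius * c for c in n]
            E = coulomb_field(charge, source, r)
            E_normal = sum(e * c for e, c in zip(E, n))
            total += E_normal * radius ** 2 * math.sin(theta) * dtheta * dphi
    return total


flux_inside = flux_through_sphere(1.0, (0.3, -0.2, 0.4))
flux_outside = flux_through_sphere(1.0, (0.0, 0.0, 2.0))
print(flux_inside, 4.0 * math.pi)
print(flux_outside)
```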

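Integrating (30.3) over an open surface and applying Stokes’ theorem

$\int \operatorname{curl} \mathbf{H} \cdot d\mathbf{f} = \oint \mathbf{H} \cdot d\mathbf{l}$,

we find

$\oint \mathbf{H} \cdot d\mathbf{l} = \frac{1}{c} \frac{\partial}{\partial t} \int \mathbf{E} \cdot d\mathbf{f} + \frac{4\pi}{c} \int \mathbf{j} \cdot d\mathbf{f}$. (30.6)

† The Maxwell equations in a form applicable to point charges in the electromagnetic field in vacuum
were formulated by H. A. Lorentz.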

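The quantity

$\frac{1}{4\pi} \frac{\partial \mathbf{E}}{\partial t}$ (30.7)

is called the “displacement current”. From (30.6) written in the form

$\oint \mathbf{H} \cdot d\mathbf{l} = \frac{4\pi}{c} \int \left( \mathbf{j} + \frac{1}{4\pi} \frac{\partial \mathbf{E}}{\partial t} \right) \cdot d\mathbf{f}$, (30.8)

we see that the circulation of the magnetic field around any contour is equal to $4\pi/c$ times
the sum of the true current and displacement current passing through a surface bounded by
this contour.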

From the Maxwell equations we can obtain the already familiar continuity equation (29.3).
Taking the divergence of both sides of (30.3), we find

$\operatorname{div} \operatorname{curl} \mathbf{H} = \frac{1}{c} \frac{\partial}{\partial t} \operatorname{div} \mathbf{E} + \frac{4\pi}{c} \operatorname{div} \mathbf{j}$.

But div curl H = 0 and div E = $4\pi \varrho$, according to (30.4). Thus we arrive once more at
equation (29.3). In four-dimensional form, from (30.2), we have:

$\frac{\partial^2 F^{ik}}{\partial x^i \partial x^k} = -\frac{4\pi}{c} \frac{\partial j^i}{\partial x^i}$.

But when the operator $\partial^2/\partial x^i \partial x^k$, which is symmetric in the indices i and k, is applied to the
antisymmetric tensor $F^{ik}$, it gives zero identically and we arrive at the continuity equation
(29.4) expressed in four-dimensional form.


§ 31. Energy density and energy flux 

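Let us multiply both sides of (30.3) by E and both sides of (26.1) by H and combine the
resultant equations. Then we get

$\frac{1}{c} \mathbf{E} \cdot \frac{\partial \mathbf{E}}{\partial t} + \frac{1}{c} \mathbf{H} \cdot \frac{\partial \mathbf{H}}{\partial t} = -\frac{4\pi}{c} \mathbf{j} \cdot \mathbf{E} - (\mathbf{H} \cdot \operatorname{curl} \mathbf{E} - \mathbf{E} \cdot \operatorname{curl} \mathbf{H})$.

Using the well-known formula of vector analysis,

$\operatorname{div} (\mathbf{a} \times \mathbf{b}) = \mathbf{b} \cdot \operatorname{curl} \mathbf{a} - \mathbf{a} \cdot \operatorname{curl} \mathbf{b}$,

we rewrite this relation in the form

$\frac{1}{2c} \frac{\partial}{\partial t} (E^2 + H^2) = -\frac{4\pi}{c} \mathbf{j} \cdot \mathbf{E} - \operatorname{div} (\mathbf{E} \times \mathbf{H})$

or

$\frac{\partial}{\partial t} \left( \frac{E^2 + H^2}{8\pi} \right) = -\mathbf{j} \cdot \mathbf{E} - \operatorname{div} \mathbf{S}$. (31.1)

The vector

$\mathbf{S} = \frac{c}{4\pi} \mathbf{E} \times \mathbf{H}$ (31.2)

is called the Poynting vector.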

We integrate (31.1) over a volume and apply Gauss’ theorem to the second term on the
right. Then we obtain

$\frac{\partial}{\partial t} \int \frac{E^2 + H^2}{8\pi} \, dV = -\int \mathbf{j} \cdot \mathbf{E} \, dV - \oint \mathbf{S} \cdot d\mathbf{f}$. (31.3)

If the integral extends over all space, then the surface integral vanishes (the field is zero
at infinity). Furthermore, we can express the integral $\int \mathbf{j} \cdot \mathbf{E} \, dV$ as a sum $\sum e \mathbf{v} \cdot \mathbf{E}$ over all the
charges, and substitute from (17.7):

$e \mathbf{v} \cdot \mathbf{E} = \frac{d}{dt} \mathcal{E}_{\text{kin}}$.

Then (31.3) becomes

$\frac{d}{dt} \left\{ \int \frac{E^2 + H^2}{8\pi} \, dV + \sum \mathcal{E}_{\text{kin}} \right\} = 0$. (31.4)

Thus for the closed system consisting of the electromagnetic field and particles present in
it, the quantity in brackets in this equation is conserved. The second term in this expression
is the kinetic energy (including the rest energy of all the particles; see the footnote on p. 51),
the first term is consequently the energy of the field itself. We can therefore call the quantity

$W = \frac{E^2 + H^2}{8\pi}$ (31.5)

the energy density of the electromagnetic field; it is the energy per unit volume of the field.
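For a plane wave in vacuum the quantities (31.2) and (31.5) are easy to evaluate, and (31.1) with $\mathbf{j} = 0$ reduces to $\partial W/\partial t + \operatorname{div} \mathbf{S} = 0$. The sketch below checks this by central differences and also shows that, for this wave, $|\mathbf{S}| = cW$. The wave, its parameters and the unit choice c = 1 are assumptions made for the illustration.

```python
import math

C = 1.0        # speed of light in illustrative units (assumption)
K = 2.0        # wave number (assumption)
OMEGA = C * K  # vacuum dispersion relation
AMP = 1.5      # field amplitude (assumption)
STEP = 1e-4    # finite-difference step


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def fields(t, x):
    """Plane wave travelling along x: E along y, H along z (illustrative choice)."""
    a = AMP * math.cos(K * x - OMEGA * t)
    return (0.0, a, 0.0), (0.0, 0.0, a)


def energy_density(t, x):
    """W = (E^2 + H^2) / 8 pi, as in (31.5)."""
    E, H = fields(t, x)
    return (sum(e * e for e in E) + sum(h * h for h in H)) / (8.0 * math.pi)


def poynting(t, x):
    """S = (c / 4 pi) E x H, as in (31.2)."""
    E, H = fields(t, x)
    return tuple(C / (4.0 * math.pi) * s for s in cross(E, H))


def conservation_residual(t, x):
    """dW/dt + div S; only the x derivative contributes for this wave."""
    dW_dt = (energy_density(t + STEP, x) - energy_density(t - STEP, x)) / (2.0 * STEP)
    dSx_dx = (poynting(t, x + STEP)[0] - poynting(t, x - STEP)[0]) / (2.0 * STEP)
    return dW_dt + dSx_dx


t0, x0 = 0.4, 1.1
print(poynting(t0, x0), C * energy_density(t0, x0))
print(conservation_residual(t0, x0))
```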

If we integrate over any finite volume, then the surface integral in (31.3) generally does
not vanish, so that we can write the equation in the form

$\frac{\partial}{\partial t} \left\{ \int \frac{E^2 + H^2}{8\pi} \, dV + \sum \mathcal{E}_{\text{kin}} \right\} = -\oint \mathbf{S} \cdot d\mathbf{f}$, (31.6)

where now the second term in the brackets is summed only over the particles present in the
volume under consideration. On the left stands the change in the total energy of field and
particles per unit time. Therefore the integral $\oint \mathbf{S} \cdot d\mathbf{f}$ must be interpreted as the flux of field
energy across the surface bounding the given volume, so that the Poynting vector S is this
flux density: the amount of field energy passing through unit area of the surface in unit
time.†

† We assume that at the given moment there are no charges on the surface itself. If this were not the case,
then on the right we would have to include the energy flux transported by particles passing through the
surface.

§ 32. The energy-momentum tensor

In the preceding section we derived an expression for the energy of the electromagnetic
field. Now we derive this expression, together with one for the field momentum, in
four-dimensional form. In doing this we shall for simplicity consider for the present an
electromagnetic field without charges. Having in mind later applications (to the gravitational
field), and also to simplify the calculation, we present the derivation in a general form, not
specializing the nature of the system. So we consider any system whose action integral has
the form

$S = \int \Lambda \left( q, \frac{\partial q}{\partial x^i} \right) dV \, dt = \frac{1}{c} \int \Lambda \, d\Omega$, (32.1)

where $\Lambda$ is some function of the quantities q, describing the state of the system, and of their
first derivatives with respect to coordinates and time (for the electromagnetic field the
components of the four-potential are the quantities q); for brevity we write here only one of
the q’s. We note that the space integral $\int \Lambda \, dV$ is the Lagrangian of the system, so that $\Lambda$ can
be considered as the Lagrangian “density”. The mathematical expression of the fact that the
system is closed is the absence of any explicit dependence of $\Lambda$ on the $x^i$, similarly to the
situation for a closed system in mechanics, where the Lagrangian does not depend explicitly
on the time.

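The “equations of motion” (i.e. the field equations, if we are dealing with some field) are
obtained in accordance with the principle of least action by varying S. We have (for brevity
we write $q_{,i} \equiv \partial q/\partial x^i$):

$\delta S = \frac{1}{c} \int \left( \frac{\partial \Lambda}{\partial q} \delta q + \frac{\partial \Lambda}{\partial q_{,i}} \delta q_{,i} \right) d\Omega = \frac{1}{c} \int \left[ \frac{\partial \Lambda}{\partial q} \delta q + \frac{\partial}{\partial x^i} \left( \frac{\partial \Lambda}{\partial q_{,i}} \delta q \right) - \delta q \frac{\partial}{\partial x^i} \frac{\partial \Lambda}{\partial q_{,i}} \right] d\Omega = 0$.

The second term in the integrand, after transformation by Gauss’ theorem, vanishes upon
integration over all space, and we then find the following “equations of motion”:

$\frac{\partial}{\partial x^i} \frac{\partial \Lambda}{\partial q_{,i}} - \frac{\partial \Lambda}{\partial q} = 0$ (32.2)

(it is, of course, understood that we sum over any repeated index).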

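The remainder of the derivation is similar to the procedure in mechanics for deriving the
conservation of energy. Namely, we write:

$\frac{\partial \Lambda}{\partial x^i} = \frac{\partial \Lambda}{\partial q} \frac{\partial q}{\partial x^i} + \frac{\partial \Lambda}{\partial q_{,k}} \frac{\partial q_{,k}}{\partial x^i}$.

Substituting (32.2) and noting that $q_{,k,i} = q_{,i,k}$, we find

$\frac{\partial \Lambda}{\partial x^i} = \frac{\partial}{\partial x^k} \left( \frac{\partial \Lambda}{\partial q_{,k}} \right) q_{,i} + \frac{\partial \Lambda}{\partial q_{,k}} \frac{\partial q_{,i}}{\partial x^k} = \frac{\partial}{\partial x^k} \left( q_{,i} \frac{\partial \Lambda}{\partial q_{,k}} \right)$.

On the other hand, we can write

$\frac{\partial \Lambda}{\partial x^i} = \delta_i^k \frac{\partial \Lambda}{\partial x^k}$,

so that, introducing the notation

$T_i^k = q_{,i} \frac{\partial \Lambda}{\partial q_{,k}} - \delta_i^k \Lambda$, (32.3)

we can express the relation in the form

$\frac{\partial T_i^k}{\partial x^k} = 0$. (32.4)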


We note that if there is not one but several quantities $q^{(l)}$, then in place of (32.3) we must
write

$T_i^k = \sum_l q^{(l)}_{,i} \frac{\partial \Lambda}{\partial q^{(l)}_{,k}} - \delta_i^k \Lambda$. (32.5)

But in § 29 we saw that an equation of the form $\partial A^k/\partial x^k = 0$, i.e. the vanishing of the
four-divergence of a vector, is equivalent to the statement that the integral $\int A^k \, dS_k$ of the vector
over a hypersurface which contains all of three-dimensional space is conserved. It is clear
that an analogous result holds for the divergence of a tensor; the equation (32.4) asserts that
the vector $P^i = \text{const} \int T^{ik} \, dS_k$ is conserved.

This vector must be identified with the four-vector of momentum of the system. We 
choose the constant factor in front of the integral so that, in accord with our previous 
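definition, the time component $P^0$ is equal to the energy of the system multiplied by 1/c. To
do this we note that

$$P^0 = \mathrm{const}\int T^{0k}\,dS_k = \mathrm{const}\int T^{00}\,dV$$

if the integration is extended over the hyperplane $x^0 = \mathrm{const}$. On the other hand, according
to (32.3),

$$T^{00} = \dot{q}\,\frac{\partial\Lambda}{\partial\dot{q}} - \Lambda \qquad \left(\dot{q} \equiv \frac{\partial q}{\partial t}\right).$$
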
Comparing with the usual formulas relating the energy and the Lagrangian, we see that
this quantity must be considered as the energy density of the system, and therefore $\int T^{00}\,dV$
is the total energy of the system. Thus we must set const = 1/c, and we get finally for the
four-momentum of the system the expression

$$P^i = \frac{1}{c}\int T^{ik}\,dS_k. \qquad (32.6)$$

The tensor $T^{ik}$ is called the energy-momentum tensor of the system.

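It is necessary to point out that the definition of the tensor $T^{ik}$ is not unique. In fact, if $T^{ik}$
is defined by (32.3), then any other tensor of the form

$$T^{ik} + \frac{\partial}{\partial x^l}\psi^{ikl}, \qquad \psi^{ikl} = -\psi^{ilk}, \qquad (32.7)$$

will also satisfy equation (32.4), since we have identically $\partial^2\psi^{ikl}/\partial x^k\,\partial x^l = 0$. The total
four-momentum of the system does not change, since according to (6.17) we can write

$$\int\frac{\partial\psi^{ikl}}{\partial x^l}\,dS_k = \frac{1}{2}\int\left(dS_k\,\frac{\partial\psi^{ikl}}{\partial x^l} - dS_l\,\frac{\partial\psi^{ikl}}{\partial x^k}\right) = \frac{1}{2}\oint\psi^{ikl}\,df^*_{kl},$$
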
where the integration on the right side of the equation is extended over the (ordinary) surface 
which “bounds” the hypersurface over which the integration on the left is taken. This 
surface is clearly located at infinity in the three-dimensional space, and since neither field 
nor particles are present at infinity this integral is zero. Thus the four-momentum of the 
system is, as it must be, a uniquely determined quantity. To define the tensor $T^{ik}$ uniquely we
can use the requirement that the four-tensor of angular momentum (see § 14) of the system 
be expressed in terms of the four-momentum by 

$$M^{ik} = \int\left(x^i\,dP^k - x^k\,dP^i\right) = \frac{1}{c}\int\left(x^i T^{kl} - x^k T^{il}\right)dS_l, \qquad (32.8)$$

that is its “density” is expressed in terms of the “density” of momentum by the usual 
formula. 

It is easy to determine what conditions the energy-momentum tensor must satisfy in order 
that this be valid. We note that the law of conservation of angular momentum can be 
expressed, as we already know, by setting equal to zero the divergence of the expression 
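under the integral sign in $M^{ik}$. Thus

$$\frac{\partial}{\partial x^l}\left(x^i T^{kl} - x^k T^{il}\right) = 0. \qquad (32.9)$$

Noting that $\partial x^i/\partial x^l = \delta^i_l$ and that $\partial T^{kl}/\partial x^l = 0$, we find from this

$$\delta^i_l\,T^{kl} - \delta^k_l\,T^{il} = T^{ki} - T^{ik} = 0$$

or

$$T^{ik} = T^{ki}, \qquad (32.10)$$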

that is, the energy-momentum tensor must be symmetric. 

We note that $T^{ik}$, defined by formula (32.5), is generally speaking not symmetric, but can
be made so by transformation (32.7) with suitable $\psi^{ikl}$. Later on (§ 94) we shall see that there
is a direct method for obtaining a symmetric tensor $T^{ik}$.

As we mentioned above, if we carry out the integration in (32.6) over the hyperplane
$x^0 = \mathrm{const}$, then $P^i$ takes on the form

$$P^i = \frac{1}{c}\int T^{i0}\,dV, \qquad (32.11)$$

where the integration extends over the whole (three-dimensional) space. The space components 
of $P^i$ form the three-dimensional momentum vector of the system and the time component
is its energy multiplied by 1/c. Thus the vector with components

$$\frac{1}{c}T^{10}, \quad \frac{1}{c}T^{20}, \quad \frac{1}{c}T^{30}$$

may be called the “momentum density”, and the quantity

$$W = T^{00}$$

the “energy density”.

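To clarify the meaning of the remaining components of $T^{ik}$, we separate the conservation
equation (32.4) into space and time parts:

$$\frac{1}{c}\frac{\partial T^{00}}{\partial t} + \frac{\partial T^{0\alpha}}{\partial x^\alpha} = 0, \qquad \frac{1}{c}\frac{\partial T^{\alpha 0}}{\partial t} + \frac{\partial T^{\alpha\beta}}{\partial x^\beta} = 0. \qquad (32.12)$$

We integrate these equations over a volume V in space. From the first equation

$$\frac{1}{c}\frac{\partial}{\partial t}\int T^{00}\,dV + \int\frac{\partial T^{0\alpha}}{\partial x^\alpha}\,dV = 0$$

or, transforming the second integral by Gauss’ theorem,

$$\frac{\partial}{\partial t}\int T^{00}\,dV = -c\oint T^{0\alpha}\,df_\alpha, \qquad (32.13)$$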


where the integral on the right is taken over the surface surrounding the volume V ($df_x$, $df_y$,
$df_z$ are the components of the three-vector of the surface element df). The expression on the
left is the rate of change of the energy contained in the volume V; from this it is clear that
the expression on the right is the amount of energy transferred across the boundary of the
volume V, and the vector S with components

$$cT^{01}, \quad cT^{02}, \quad cT^{03}$$

is its flux density, that is, the amount of energy passing through unit surface in unit time. Thus we
arrive at the important conclusion that the requirements of relativistic invariance, as expressed
by the tensor character of the quantities $T^{ik}$, automatically lead to a definite connection
between the energy flux and the momentum density: the energy flux density is equal to the
momentum density multiplied by $c^2$.
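
The connection just stated can be written out in one line. The symbol $g^\alpha$ for the momentum density is introduced here only for illustration and is not used elsewhere in the text; the symmetry of the tensor is the property (32.10).

```latex
S^\alpha = c\,T^{0\alpha}, \qquad
g^\alpha = \frac{1}{c}\,T^{\alpha 0}, \qquad
T^{0\alpha} = T^{\alpha 0}
\;\Longrightarrow\;
\mathbf{S} = c^2\,\mathbf{g}.
```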

From the second equation in (32.12) we find similarly:

$$\frac{\partial}{\partial t}\int\frac{1}{c}\,T^{\alpha 0}\,dV = -\oint T^{\alpha\beta}\,df_\beta. \qquad (32.14)$$

On the left is the change of the momentum of the system in volume V per unit time, therefore
$\oint T^{\alpha\beta}\,df_\beta$ is the momentum emerging from the volume V per unit time. Thus the components
$T^{\alpha\beta}$ of the energy-momentum tensor constitute the three-dimensional tensor of momentum
flux density; we denote it by $-\sigma_{\alpha\beta}$, where $\sigma_{\alpha\beta}$ is the stress tensor. The energy flux density
is a vector; the density of flux of momentum, which is itself a vector, must obviously be a
tensor (the component $T^{\alpha\beta}$ of this tensor is the amount of the α-component of the momentum
passing per unit time through unit surface perpendicular to the $x^\beta$ axis).

We give a table indicating the meanings of the individual components of the
energy-momentum tensor:

$$T^{ik} = \begin{pmatrix} W & S_x/c & S_y/c & S_z/c \\ S_x/c & -\sigma_{xx} & -\sigma_{xy} & -\sigma_{xz} \\ S_y/c & -\sigma_{yx} & -\sigma_{yy} & -\sigma_{yz} \\ S_z/c & -\sigma_{zx} & -\sigma_{zy} & -\sigma_{zz} \end{pmatrix}. \qquad (32.15)$$


§ 33. Energy-momentum tensor of the electromagnetic field

We now apply the general relations obtained in the previous section to the electromagnetic 
field. For the electromagnetic field, the quantity standing under the integral sign in (32.1) is 
equal, according to (27.4), to

$$\Lambda = -\frac{1}{16\pi}\,F_{kl}F^{kl}.$$

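The quantities q are the components of the four-potential of the field, $A_k$, so that the definition
(32.5) of the tensor $T_i^k$ becomes

$$T_i^k = \frac{\partial A_l}{\partial x^i}\,\frac{\partial\Lambda}{\partial(\partial A_l/\partial x^k)} - \delta_i^k\,\Lambda.$$

To calculate the derivatives of Λ which appear here, we find the variation δΛ. We have

$$\delta\Lambda = -\frac{1}{8\pi}\,F^{kl}\,\delta F_{kl} = -\frac{1}{8\pi}\,F^{kl}\,\delta\left(\frac{\partial A_l}{\partial x^k} - \frac{\partial A_k}{\partial x^l}\right)$$

or, interchanging indices and making use of the fact that $F_{kl} = -F_{lk}$,

$$\delta\Lambda = -\frac{1}{4\pi}\,F^{kl}\,\delta\frac{\partial A_l}{\partial x^k}.$$

From this we see that

$$\frac{\partial\Lambda}{\partial(\partial A_l/\partial x^k)} = -\frac{1}{4\pi}\,F^{kl},$$

and therefore

$$T_i^k = -\frac{1}{4\pi}\frac{\partial A_l}{\partial x^i}\,F^{kl} + \frac{1}{16\pi}\,\delta_i^k\,F_{lm}F^{lm},$$

or, for the contravariant components:

$$T^{ik} = -\frac{1}{4\pi}\frac{\partial A^l}{\partial x_i}\,F^k{}_l + \frac{1}{16\pi}\,g^{ik}\,F_{lm}F^{lm}.$$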
But this tensor is not symmetric. To symmetrize it we add the quantity

$$\frac{1}{4\pi}\frac{\partial A^i}{\partial x_l}\,F^k{}_l.$$

According to the field equation (30.2) in the absence of charges, $\partial F^k{}_l/\partial x_l = 0$, and therefore

$$\frac{1}{4\pi}\frac{\partial A^i}{\partial x_l}\,F^k{}_l = \frac{1}{4\pi}\frac{\partial}{\partial x_l}\left(A^i F^k{}_l\right),$$

so that the change made in $T^{ik}$ is of the form (32.7) and is admissible. Since
$\partial A^l/\partial x_i - \partial A^i/\partial x_l = F^{il}$, we get finally the following expression for the energy-momentum tensor of the
electromagnetic field:

$$T^{ik} = \frac{1}{4\pi}\left(-F^{il}F^k{}_l + \frac{1}{4}\,g^{ik}\,F_{lm}F^{lm}\right). \qquad (33.1)$$

This tensor is obviously symmetric. In addition it has the property that 

$$T_i^i = 0, \qquad (33.2)$$

i.e. the sum of its diagonal terms is zero. 

Let us express the components of the tensor $T^{ik}$ in terms of the electric and magnetic field
intensities. By using the values (23.5) for the components $F^{ik}$, we easily verify that the
quantity $T^{00}$ coincides with the energy density (31.5), while the components $cT^{0\alpha}$ are the
same as the components of the Poynting vector (31.2). The space components $T^{\alpha\beta}$ form a
three-dimensional tensor with components

$$-\sigma_{xx} = \frac{1}{8\pi}\left(E_y^2 + E_z^2 - E_x^2 + H_y^2 + H_z^2 - H_x^2\right),$$

$$-\sigma_{xy} = -\frac{1}{4\pi}\left(E_x E_y + H_x H_y\right),$$

etc., or

$$\sigma_{\alpha\beta} = \frac{1}{4\pi}\left\{E_\alpha E_\beta + H_\alpha H_\beta - \frac{1}{2}\,\delta_{\alpha\beta}\left(E^2 + H^2\right)\right\}. \qquad (33.3)$$

This tensor is called the Maxwell stress tensor.
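
The component expressions above lend themselves to a quick numerical check. The following is a minimal sketch, not part of the original text: it assembles $T^{ik}$ from given three-vectors E and H in Gaussian units, assuming the energy density $(E^2 + H^2)/8\pi$ and the Poynting vector $(c/4\pi)\,\mathbf{E}\times\mathbf{H}$ referred to as (31.5) and (31.2). The function names `em_energy_momentum_tensor` and `mixed_trace` are hypothetical.

```python
import math

def em_energy_momentum_tensor(E, H):
    """Return the 4x4 contravariant tensor T^{ik} (Gaussian units) built from
    the 3-vectors E and H: T^{00} = W, T^{0a} = S_a/c, T^{ab} = -sigma_ab."""
    E2 = sum(x * x for x in E)
    H2 = sum(x * x for x in H)
    W = (E2 + H2) / (8 * math.pi)            # energy density, assumed form of (31.5)
    S_over_c = [                             # (E x H) / (4 pi), assumed form of (31.2)
        (E[1] * H[2] - E[2] * H[1]) / (4 * math.pi),
        (E[2] * H[0] - E[0] * H[2]) / (4 * math.pi),
        (E[0] * H[1] - E[1] * H[0]) / (4 * math.pi),
    ]
    T = [[0.0] * 4 for _ in range(4)]
    T[0][0] = W
    for a in range(3):
        T[0][a + 1] = T[a + 1][0] = S_over_c[a]
        for b in range(3):
            delta = 1.0 if a == b else 0.0
            # Maxwell stress tensor (33.3)
            sigma = (E[a] * E[b] + H[a] * H[b]
                     - 0.5 * delta * (E2 + H2)) / (4 * math.pi)
            T[a + 1][b + 1] = -sigma
    return T

def mixed_trace(T):
    """T_i^i = g_{ik} T^{ik} with the metric signature (+, -, -, -)."""
    return T[0][0] - T[1][1] - T[2][2] - T[3][3]
```

For arbitrary E and H the result is symmetric and `mixed_trace` vanishes to rounding error, in line with (33.2).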

To bring the tensor $T^{ik}$ to diagonal form, we must transform to a reference system in which
the vectors E and H (at the given point in space and moment in time) are parallel to one
another or where one of them is equal to zero; as we know (§ 25), such a transformation is
always possible except when E and H are mutually perpendicular and equal in magnitude.
It is easy to see that after the transformation the only non-zero components of $T^{ik}$ will be

$$T^{00} = -T^{11} = T^{22} = T^{33} = W$$

(the x axis has been taken along the direction of the field).

But if the vectors E and H are mutually perpendicular and equal in magnitude, the tensor
$T^{ik}$ cannot be brought to diagonal form.† The non-zero components in this case are

$$T^{00} = T^{33} = T^{30} = W$$

(where the x axis is taken along the direction of E and the y axis along H). 

Up to now we have considered fields in the absence of charges. When charged particles 
are present, the energy-momentum tensor of the whole system is the sum of the
energy-momentum tensors for the electromagnetic field and for the particles, where in the latter the
particles are assumed not to interact with one another. 

† The fact that the reduction of the symmetric tensor $T^{ik}$ to principal axes may be impossible is related
to the fact that the four-space is pseudo-euclidean. (See also the problem in § 94.)

To determine the form of the energy-momentum tensor of the particles we must describe
their mass distribution in space by using a “mass density” in the same way as we describe
a distribution of point charges in terms of their density. Analogously to formula (28.1) for
the charge density, we can write the mass density in the form

$$\mu = \sum_a m_a\,\delta(\mathbf{r} - \mathbf{r}_a), \qquad (33.4)$$

where r a are the radius-vectors of the particles, and the summation extends over all the 
particles of the system. 

The “four-momentum density” of the particles is given by $\mu c u^i$. We know that this density
is the component $T^{0\alpha}/c$ of the energy-momentum tensor, i.e. $T^{0\alpha} = \mu c^2 u^\alpha$ (α = 1, 2, 3). But
the mass density is the time component of the four-vector $(\mu/c)(dx^k/dt)$ (in analogy to the
charge density; see § 28). Therefore the energy-momentum tensor of the system of
non-interacting particles is

$$T^{ik} = \mu c\,\frac{dx^i}{ds}\frac{dx^k}{dt} = \mu c\,u^i u^k\,\frac{ds}{dt}. \qquad (33.5)$$


As expected, this tensor is symmetric. 

We verify by a direct computation that the energy and momentum of the system, defined 
as the sum of the energies and momenta of field and particles, are actually conserved. In 
other words we shall verify the equation 

$$\frac{\partial}{\partial x^k}\left(T^{(f)k}_i + T^{(p)k}_i\right) = 0, \qquad (33.6)$$

which expresses these conservation laws. 

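Differentiating (33.1), we write

$$\frac{\partial T^{(f)k}_i}{\partial x^k} = \frac{1}{4\pi}\left(\frac{1}{2}\,F^{lm}\frac{\partial F_{lm}}{\partial x^i} - F^{kl}\frac{\partial F_{il}}{\partial x^k} - F_{il}\frac{\partial F^{kl}}{\partial x^k}\right).$$

Substituting from the Maxwell equations (26.5) and (30.2),

$$\frac{\partial F^{kl}}{\partial x^k} = \frac{4\pi}{c}\,j^l, \qquad \frac{\partial F_{lm}}{\partial x^i} = -\frac{\partial F_{mi}}{\partial x^l} - \frac{\partial F_{il}}{\partial x^m},$$

we have:

$$\frac{\partial T^{(f)k}_i}{\partial x^k} = \frac{1}{4\pi}\left(-\frac{1}{2}\,F^{lm}\frac{\partial F_{mi}}{\partial x^l} - \frac{1}{2}\,F^{lm}\frac{\partial F_{il}}{\partial x^m} - F^{kl}\frac{\partial F_{il}}{\partial x^k} - \frac{4\pi}{c}\,F_{il}\,j^l\right).$$

By permuting the indices, we easily show that the first three terms on the right cancel one
another, and we arrive at the result:

$$\frac{\partial T^{(f)k}_i}{\partial x^k} = -\frac{1}{c}\,F_{ik}\,j^k. \qquad (33.7)$$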


Differentiating the expression (33.5) for the energy-momentum tensor of the particles gives

$$\frac{\partial T^{(p)k}_i}{\partial x^k} = c\,u_i\,\frac{\partial}{\partial x^k}\left(\mu\,\frac{dx^k}{dt}\right) + \mu c\,\frac{dx^k}{dt}\frac{\partial u_i}{\partial x^k}.$$

The first term in this expression is zero because of the conservation of mass for
non-interacting particles. In fact, the quantities $\mu(dx^k/dt)$ constitute the “mass current”
four-vector, analogous to the charge current four-vector (28.2); the conservation of mass is
expressed by equating to zero the divergence of this four-vector:

$$\frac{\partial}{\partial x^k}\left(\mu\,\frac{dx^k}{dt}\right) = 0, \qquad (33.8)$$

just as the conservation of charge is expressed by equation (29.4).
Thus we have:

$$\frac{\partial T^{(p)k}_i}{\partial x^k} = \mu c\,\frac{dx^k}{dt}\frac{\partial u_i}{\partial x^k} = \mu c\,\frac{du_i}{dt}.$$


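Next we use the equation of motion of the charges in the field, expressed in the
four-dimensional form (23.4):

$$mc\,\frac{du_i}{ds} = \frac{e}{c}\,F_{ik}u^k.$$

Changing to continuous distributions of charge and mass, we have, from the definitions of
the densities μ and ρ: μ/m = ρ/e. We can therefore write the equation of motion in the form

$$\mu c\,\frac{du_i}{ds} = \frac{\rho}{c}\,F_{ik}u^k$$

or

$$\mu c\,\frac{du_i}{dt} = \frac{\rho}{c}\,F_{ik}u^k\,\frac{ds}{dt} = \frac{1}{c}\,F_{ik}\,j^k.$$

Thus,

$$\frac{\partial T^{(p)k}_i}{\partial x^k} = \frac{1}{c}\,F_{ik}\,j^k. \qquad (33.9)$$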


Combining this with (33.7), we find that we actually get zero, i.e. we arrive at equation (33.6). 


PROBLEM 

Find the law of transformation of the energy density, the energy flux density, and the components of the 
stress tensor under a Lorentz transformation. 

Solution: Suppose that the K' coordinate system moves relative to the K system along the x axis with 
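velocity V. Applying the formulas of problem 1, § 6 to the symmetric tensor $T^{ik}$, we find:

$$W = \frac{1}{1 - V^2/c^2}\left(W' + \frac{V}{c^2}\,2S'_x - \frac{V^2}{c^2}\,\sigma'_{xx}\right),$$

$$S_x = \frac{1}{1 - V^2/c^2}\left[\left(1 + \frac{V^2}{c^2}\right)S'_x + VW' - V\sigma'_{xx}\right],$$

$$S_y = \frac{1}{\sqrt{1 - V^2/c^2}}\left(S'_y - V\sigma'_{xy}\right),$$

$$\sigma_{xx} = \frac{1}{1 - V^2/c^2}\left(\sigma'_{xx} - 2\,\frac{V}{c^2}\,S'_x - \frac{V^2}{c^2}\,W'\right),$$

$$\sigma_{yy} = \sigma'_{yy}, \qquad \sigma_{zz} = \sigma'_{zz}, \qquad \sigma_{yz} = \sigma'_{yz},$$

$$\sigma_{xy} = \frac{1}{\sqrt{1 - V^2/c^2}}\left(\sigma'_{xy} - \frac{V}{c^2}\,S'_y\right),$$

and similar formulas for $S_z$ and $\sigma_{xz}$.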
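
The transformation laws asked for in this problem can be cross-checked numerically. The sketch below is an illustration only: it boosts a tensor laid out as in table (32.15) with the matrix of a Lorentz transformation along x, and compares the result with closed-form expressions obtained by writing that transformation out by components. The helper names `assemble`, `boost_tensor` and `closed_form` are hypothetical.

```python
import math

def assemble(W, S, sigma, c=1.0):
    """Build T^{ik} from W, the 3-vector S and the 3x3 stress tensor sigma,
    following the layout of table (32.15)."""
    T = [[0.0] * 4 for _ in range(4)]
    T[0][0] = W
    for a in range(3):
        T[0][a + 1] = T[a + 1][0] = S[a] / c
        for b in range(3):
            T[a + 1][b + 1] = -sigma[a][b]
    return T

def boost_tensor(Tp, V, c=1.0):
    """Transform T'^{ik} (given in K') to the K system; K' moves along x with velocity V."""
    beta = V / c
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    L = [[gamma, gamma * beta, 0.0, 0.0],
         [gamma * beta, gamma, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
    return [[sum(L[i][m] * L[k][n] * Tp[m][n]
                 for m in range(4) for n in range(4))
             for k in range(4)] for i in range(4)]

def closed_form(Wp, Sp, sp, V, c=1.0):
    """Closed-form W, S_x, S_y, sigma_xx, sigma_xy in K from the primed quantities."""
    g2 = 1.0 / (1.0 - V * V / c**2)
    g = math.sqrt(g2)
    W = g2 * (Wp + 2.0 * V / c**2 * Sp[0] - V**2 / c**2 * sp[0][0])
    Sx = g2 * ((1.0 + V**2 / c**2) * Sp[0] + V * Wp - V * sp[0][0])
    Sy = g * (Sp[1] - V * sp[0][1])
    sxx = g2 * (sp[0][0] - 2.0 * V / c**2 * Sp[0] - V**2 / c**2 * Wp)
    sxy = g * (sp[0][1] - V / c**2 * Sp[1])
    return W, Sx, Sy, sxx, sxy
```

Comparing `boost_tensor(assemble(...), V, c)` with `closed_form(...)` for arbitrary symmetric input gives agreement to rounding error, which guards against sign slips in the component formulas.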


§ 34. The virial theorem 


Since the sum of the diagonal terms of the energy-momentum tensor of the electromagnetic
field is equal to zero, the sum $T_i^i$ for any system of interacting particles reduces to the trace
of the energy-momentum tensor for the particles alone. Using (33.5), we therefore have:

$$T_i^i = T^{(p)i}_i = \mu c\,u_i u^i\,\frac{ds}{dt} = \mu c\,\frac{ds}{dt} = \mu c^2\sqrt{1 - \frac{v^2}{c^2}}.$$

Let us rewrite this result, shifting to a summation over the particles, i.e. writing μ as the sum
(33.4). We then get finally:

$$T_i^i = \sum_a m_a c^2\sqrt{1 - \frac{v_a^2}{c^2}}\;\delta(\mathbf{r} - \mathbf{r}_a). \qquad (34.1)$$

We note that, according to this formula, we have for every system:

$$T_i^i \ge 0, \qquad (34.2)$$

where the equality sign holds only for the electromagnetic field without charges.

Let us consider a closed system of charged particles carrying out a finite motion, in which
all the quantities (coordinates, momenta) characterizing the system vary over finite ranges.†
We average the equation

$$\frac{1}{c}\frac{\partial T^{\alpha 0}}{\partial t} + \frac{\partial T^{\alpha\beta}}{\partial x^\beta} = 0$$

[see (32.12)] with respect to the time. The average of the derivative $\partial T^{\alpha 0}/\partial t$, like the average
of the derivative of any bounded quantity, is zero.‡ Therefore we get

$$\frac{\partial}{\partial x^\beta}\,\overline{T^\beta_\alpha} = 0.$$

† Here we also assume that the electromagnetic field of the system goes to zero sufficiently rapidly at
infinity. In specific cases this condition may require the neglect of radiation of electromagnetic waves by
the system.

‡ Let f(t) be such a quantity. Then the average value of the derivative df/dt over a certain time interval T is

$$\overline{\frac{df}{dt}} = \frac{1}{T}\int_0^T\frac{df}{dt}\,dt = \frac{f(T) - f(0)}{T}.$$

Since f(t) varies only within finite limits, then as T increases without limit, the average value of df/dt clearly
goes to zero.


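We multiply this equation by $x^\alpha$ and integrate over all space. We transform the integral by
Gauss’ theorem, keeping in mind that at infinity $T^\beta_\alpha = 0$, and so the surface integral vanishes:

$$\int x^\alpha\,\frac{\partial\overline{T^\beta_\alpha}}{\partial x^\beta}\,dV = -\int\frac{\partial x^\alpha}{\partial x^\beta}\,\overline{T^\beta_\alpha}\,dV = -\int\delta^\alpha_\beta\,\overline{T^\beta_\alpha}\,dV = 0,$$

or finally,

$$\int\overline{T^\alpha_\alpha}\,dV = 0. \qquad (34.3)$$

On the basis of this equality we can write for the integral of $\overline{T^i_i} = \overline{T^\alpha_\alpha} + \overline{T^0_0}$:

$$\int\overline{T^i_i}\,dV = \int\overline{T^0_0}\,dV = \mathcal{E},$$

where $\mathcal{E}$ is the total energy of the system.

Finally, substituting (34.1) we get:

$$\mathcal{E} = \sum_a m_a c^2\,\overline{\sqrt{1 - \frac{v_a^2}{c^2}}}. \qquad (34.4)$$

This relation is the relativistic generalization of the virial theorem of classical mechanics.
(See Mechanics, § 10.) For low velocities, it becomes

$$\mathcal{E} - \sum_a m_a c^2 = -\sum_a\overline{\frac{m_a v_a^2}{2}},$$

that is, the total energy (minus the rest energy) is equal to the negative of the average value
of the kinetic energy, in agreement with the result given by the classical virial theorem for
a system of charged particles (interacting according to the Coulomb law).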
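
The low-velocity limit follows from expanding the square root in (34.4) to first order in $v_a^2/c^2$. The lines below sketch that step and, as a reminder only, set it beside the classical virial relation for a potential energy homogeneous of degree −1 in the coordinates (the Coulomb case). The symbols K and U for the kinetic and potential energies are introduced here for illustration and are not used in the text.

```latex
\mathcal{E} = \sum_a m_a c^2\,\overline{\sqrt{1 - \frac{v_a^2}{c^2}}}
\approx \sum_a m_a c^2\left(1 - \frac{\overline{v_a^2}}{2c^2}\right)
= \sum_a m_a c^2 - \sum_a \overline{\frac{m_a v_a^2}{2}},
\qquad
2\,\overline{K} = -\overline{U}
\;\Longrightarrow\;
\overline{K} + \overline{U} = -\overline{K}.
```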

We must point out that our formulas have a quite formal character and need to be made 
more precise. The point is that the electromagnetic field energy contains terms that give an 
infinite contribution to the electromagnetic self-energy of point charges (see § 37). To give 
meaning to the corresponding expressions we should omit these terms, considering that the
intrinsic electromagnetic energy is already included in the kinetic energy of the particle
(9.4). This means that we should “renormalize” the energy making the replacement

$$\mathcal{E} \to \mathcal{E} - \sum_a\int\frac{E_a^2 + H_a^2}{8\pi}\,dV$$

in (34.4), where $\mathbf{E}_a$ and $\mathbf{H}_a$ are the fields produced by the a’th particle. Similarly in (34.3)
we should make the replacement†

$$\int T^\alpha_\alpha\,dV \to \int T^\alpha_\alpha\,dV + \sum_a\int\frac{E_a^2 + H_a^2}{8\pi}\,dV.$$

† Note that without this change the expression

$$-\int T^\alpha_\alpha\,dV = \int\frac{E^2 + H^2}{8\pi}\,dV + \sum_a\frac{m_a v_a^2}{\sqrt{1 - v_a^2/c^2}}$$

is essentially positive and cannot vanish.


§ 35. The energy-momentum tensor for macroscopic bodies

In addition to the energy-momentum tensor for a system of point particles (33.5), we shall 
also need the expression for this tensor for macroscopic bodies which are treated as being 
continuous. 

The flux of momentum through the element df of the surface of the body is just the force 
acting on this surface element. Therefore $-\sigma_{\alpha\beta}\,df_\beta$ is the α-component of the force acting on
the element. Now we introduce a reference system in which a given element of volume of
the body is at rest. In such a reference system, Pascal’s law is valid, that is, the pressure p
applied to a given portion of the body is transmitted equally in all directions and is
everywhere perpendicular to the surface on which it acts.† Therefore we can write $\sigma_{\alpha\beta}\,df_\beta = -p\,df_\alpha$,
so that the stress tensor is $\sigma_{\alpha\beta} = -p\,\delta_{\alpha\beta}$. As for the components $T^{\alpha 0}$, which represent the
momentum density, they are equal to zero for the given volume element in the reference
system we are using. The component $T^{00}$ is as always the energy density of the body, which
we denote by ε; ε/c² is then the mass density of the body, i.e. the mass per unit volume. We
emphasize that we are talking here about the unit “proper” volume, that is, the volume in the 
reference system in which the given portion of the body is at rest. 

Thus, in the reference system under consideration, the energy-momentum tensor (for the 
given portion of the body) has the form: 


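$$T^{ik} = \begin{pmatrix} \varepsilon & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix}. \qquad (35.1)$$
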

Now it is easy to find the expression for the energy-momentum tensor in an arbitrary 
reference system. To do this we introduce the four-velocity $u^i$ for the macroscopic motion of
an element of volume of the body. In the rest frame of the particular element, $u^i = (1, 0)$. The
expression for $T^{ik}$ must be chosen so that in this reference system it takes on the form (35.1).
It is easy to verify that this is

$$T^{ik} = (p + \varepsilon)\,u^i u^k - p\,g^{ik}, \qquad (35.2)$$

or, for the mixed components,

$$T_i^k = (p + \varepsilon)\,u_i u^k - p\,\delta_i^k.$$


This expression gives the energy-momentum tensor for a macroscopic body. The expressions
for the energy density W, energy flow vector S and stress tensor $\sigma_{\alpha\beta}$ are:

$$W = \frac{\varepsilon + p\,v^2/c^2}{1 - v^2/c^2}, \qquad \mathbf{S} = \frac{(p + \varepsilon)\,\mathbf{v}}{1 - v^2/c^2}, \qquad \sigma_{\alpha\beta} = -\frac{(p + \varepsilon)\,v_\alpha v_\beta}{c^2\left(1 - v^2/c^2\right)} - p\,\delta_{\alpha\beta}. \qquad (35.3)$$

† Strictly speaking, Pascal’s law is valid for liquids and gases. However, for solid bodies the maximum
possible difference in the stress in different directions is negligible in comparison with the stresses which
can play a role in the theory of relativity, so that its consideration is of no interest.



If the velocity v of the macroscopic motion is small compared with the velocity of light, then
we have approximately:

$$\mathbf{S} = (p + \varepsilon)\,\mathbf{v}.$$

Since $\mathbf{S}/c^2$ is the momentum density, we see that in this case the sum $(p + \varepsilon)/c^2$ plays the role
of the mass density of the body.
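
The passage from the rest-frame form (35.1) to a moving element can be illustrated numerically. The following minimal sketch (the function name `fluid_tensor` is hypothetical) builds $T^{ik}$ from (35.2) for a given velocity, assuming the metric diag(1, −1, −1, −1), so that its components can be compared with the expressions (35.3) for W and S and with the trace ε − 3p.

```python
import math

def fluid_tensor(eps, p, v, c=1.0):
    """T^{ik} = (p + eps) u^i u^k - p g^{ik} for a body element moving with
    3-velocity v; the metric is taken as diag(1, -1, -1, -1)."""
    v2 = sum(x * x for x in v)
    gamma = 1.0 / math.sqrt(1.0 - v2 / c**2)
    # four-velocity u^i = (gamma, gamma * v / c)
    u = [gamma] + [gamma * x / c for x in v]
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = 1.0
    for a in range(1, 4):
        g[a][a] = -1.0
    return [[(p + eps) * u[i] * u[k] - p * g[i][k] for k in range(4)]
            for i in range(4)]
```

With v = 0 the function returns diag(ε, p, p, p); for non-zero v, `T[0][0]` and `c * T[0][a]` reproduce W and S of (35.3).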

The expression for $T^{ik}$ simplifies in the case where the velocities of all the particles
making up the body are small compared with the velocity of light (the velocity of the
macroscopic motion itself can be arbitrary). In this case we can neglect, in the energy
density ε, all terms small compared with the rest energy, that is, we can write $\mu_0 c^2$ in place
of ε, where $\mu_0$ is the sum of the masses of the particles present in unit (proper) volume of
the body (we emphasize that in the general case, $\mu_0$ must differ from the actual mass density
ε/c² of the body, which includes also the mass corresponding to the energy of microscopic
motion of the particles in the body and the energy of their interactions). As for the pressure
determined by the energy of microscopic motion of the molecules, in the case under
consideration it is also clearly small compared with the rest energy $\mu_0 c^2$. Thus we find

$$T^{ik} = \mu_0 c^2\,u^i u^k. \qquad (35.4)$$

From the expression (35.2), we get

$$T_i^i = \varepsilon - 3p. \qquad (35.5)$$

The general property (34.2) of the energy-momentum tensor of an arbitrary system now
shows that the following inequality is always valid for the pressure and density of a macroscopic
body:

$$p < \frac{\varepsilon}{3}. \qquad (35.6)$$

Let us compare the relation (35.5) with the general formula (34.1) which we saw was valid 
for an arbitrary system. Since we are at present considering a macroscopic body, the expression 
(34.1) must be averaged over all the values of r in unit volume. We obtain the result 


$$\varepsilon - 3p = \sum_a m_a c^2\,\overline{\sqrt{1 - \frac{v_a^2}{c^2}}} \qquad (35.7)$$


(the summation extends over all particles in unit volume). 

The right side of this equation tends to zero in the ultrarelativistic limit, so in this limit the
equation of state of matter is:†

$$p = \frac{\varepsilon}{3}. \qquad (35.8)$$

† This limiting equation of state is obtained here assuming an electromagnetic interaction between the
particles. We shall assume (when this is needed in Chapter 14) that it remains valid for the other possible
interactions between particles, though there is at present no proof of this assumption.

We apply our formula to an ideal gas, which we assume to consist of identical particles.
Since the particles of an ideal gas do not interact with one another, we can use formula
(33.5) after averaging it. Thus for an ideal gas,

$$T^{ik} = nmc\,\overline{\frac{dx^i}{dt}\frac{dx^k}{ds}},$$


where n is the number of particles in unit volume and the dash means an average over all the 
particles. If there is no macroscopic motion in the gas then we can use for $T^{ik}$ the expression
(35.1). Comparing the two formulas, we arrive at the equations:

$$\varepsilon = nmc^2\,\overline{\frac{1}{\sqrt{1 - v^2/c^2}}}, \qquad p = \frac{nm}{3}\,\overline{\frac{v^2}{\sqrt{1 - v^2/c^2}}}. \qquad (35.9)$$

These equations determine the density and pressure of a relativistic ideal gas in terms of the
velocity of its particles; the second of these replaces the well-known formula $p = nm\overline{v^2}/3$
of the nonrelativistic kinetic theory of gases.
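
The two limits of the ideal-gas expressions can be seen in a small numerical sketch. It assumes the averaged forms $\varepsilon = nmc^2\,\overline{(1 - v^2/c^2)^{-1/2}}$ and $p = (nm/3)\,\overline{v^2 (1 - v^2/c^2)^{-1/2}}$ that follow from comparing the averaged tensor (33.5) with (35.1), and it replaces the average over particles by an average over a finite list of sample speeds. The name `ideal_gas_eps_p` is hypothetical.

```python
import math

def ideal_gas_eps_p(n, m, speeds, c=1.0):
    """Energy density and pressure of an ideal gas of identical particles,
    with the average over particles taken over the given sample speeds."""
    count = len(speeds)
    inv_root = [1.0 / math.sqrt(1.0 - (v / c) ** 2) for v in speeds]
    eps = n * m * c**2 * sum(inv_root) / count
    p = n * m / 3.0 * sum(v * v * g for v, g in zip(speeds, inv_root)) / count
    return eps, p
```

For speeds much smaller than c the pressure approaches $nm\overline{v^2}/3$, while for speeds close to c the ratio p/ε approaches 1/3, the limiting equation of state (35.8).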


CHAPTER 5

§ 36. Coulomb’s law

For a constant electric, or as it is usually called, electrostatic field, the Maxwell equations 
have the form: 


$$\operatorname{div}\mathbf{E} = 4\pi\rho, \qquad (36.1)$$

$$\operatorname{curl}\mathbf{E} = 0. \qquad (36.2)$$


The electric field E is expressed in terms of the scalar potential alone by the relation

$$\mathbf{E} = -\operatorname{grad}\phi. \qquad (36.3)$$

Substituting (36.3) in (36.1), we get the equation which is satisfied by the potential of a
constant electric field:

$$\Delta\phi = -4\pi\rho. \qquad (36.4)$$

This equation is called the Poisson equation. In particular, in vacuum, i.e., for ρ = 0, the
potential satisfies the Laplace equation

$$\Delta\phi = 0. \qquad (36.5)$$


From the last equation it follows, in particular, that the potential of the electric field can
nowhere have a maximum or a minimum. For in order that φ have an extreme value, it would
be necessary that the first derivatives of φ with respect to the coordinates be zero, and that
the second derivatives $\partial^2\phi/\partial x^2$, $\partial^2\phi/\partial y^2$, $\partial^2\phi/\partial z^2$ all have the same sign. The last is impossible,
since in that case (36.5) could not be satisfied.

We now determine the field produced by a point charge. From symmetry considerations, 
it is clear that it is directed along the radius-vector from the point at which the charge e is 
located. From the same consideration it is clear that the value E of the field depends only on 
the distance R from the charge. To find this absolute value, we apply equation (36.1) in the 
integral form (30.5). The flux of the electric field through a spherical surface of radius R 
circumscribed around the charge e is equal to $4\pi R^2 E$; this flux must equal 4πe. From this we
get

$$E = \frac{e}{R^2}.$$

In vector notation:

$$\mathbf{E} = \frac{e\mathbf{R}}{R^3}. \qquad (36.6)$$

Thus the field produced by a point charge is inversely proportional to the square of the
distance from the charge. This is the Coulomb law. The potential of this field is, clearly,

$$\phi = \frac{e}{R}. \qquad (36.7)$$

If we have a system of charges, then the field produced by this system is equal, according
to the principle of superposition, to the sum of the fields produced by each of the particles
individually. In particular, the potential of such a field is

$$\phi = \sum_a\frac{e_a}{R_a},$$

where $R_a$ is the distance from the charge $e_a$ to the point at which we are determining the
potential. If we introduce the charge density ρ, this formula takes on the form

$$\phi = \int\frac{\rho}{R}\,dV, \qquad (36.8)$$



where R is the distance from the volume element dV to the given point of the field. 
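
The superposition formula for point charges, together with the statement that the potential satisfies the Laplace equation away from the charges, can be checked with a short numerical sketch. The helper names `distance`, `potential` and `laplacian` and the finite-difference step `h` are illustrative choices, not part of the text.

```python
import math

def distance(p, q):
    """Euclidean distance between two points given as coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def potential(charges, point):
    """phi = sum_a e_a / R_a for point charges given as (e_a, (x, y, z)) pairs."""
    return sum(e / distance(pos, point) for e, pos in charges)

def laplacian(f, point, h=1e-3):
    """Second-order central-difference estimate of the Laplacian of f at point."""
    total = 0.0
    for axis in range(3):
        plus = list(point)
        minus = list(point)
        plus[axis] += h
        minus[axis] -= h
        total += (f(plus) - 2.0 * f(point) + f(minus)) / (h * h)
    return total
```

Evaluated at a point away from all the charges, `laplacian` applied to `potential` returns a value close to zero, limited by the finite-difference error.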

We note a mathematical relation which is obtained from (36.4) by substituting the values 
of ρ and φ for a point charge, i.e. ρ = eδ(R) and φ = e/R. We then find

$$\Delta\frac{1}{R} = -4\pi\,\delta(\mathbf{R}). \qquad (36.9)$$


§ 37. Electrostatic energy of charges 

We determine the energy of a system of charges. We start from the energy of the field, that
is, from the expression (31.5) for the energy density. Namely, the energy of the system of
charges must be equal to

$$U = \frac{1}{8\pi}\int E^2\,dV,$$

where E is the field produced by these charges, and the integral goes over all space. Substituting
$\mathbf{E} = -\operatorname{grad}\phi$, U can be changed to the following form:

$$U = -\frac{1}{8\pi}\int\mathbf{E}\cdot\operatorname{grad}\phi\,dV = -\frac{1}{8\pi}\int\operatorname{div}(\mathbf{E}\phi)\,dV + \frac{1}{8\pi}\int\phi\,\operatorname{div}\mathbf{E}\,dV.$$

According to Gauss’ theorem, the first integral is equal to the integral of Eφ over the surface
bounding the volume of integration, but since the integral is taken over all space and since
the field is zero at infinity, this integral vanishes. Substituting in the second integral, div E
= 4πρ, we find the following expression for the energy of a system of charges:

$$U = \frac{1}{2}\int\rho\phi\,dV. \qquad (37.1)$$

For a system of point charges, $e_a$, we can write in place of the integral a sum over the charges

$$U = \frac{1}{2}\sum_a e_a\phi_a, \qquad (37.2)$$

where $\phi_a$ is the potential of the field produced by all the charges, at the point where the
charge $e_a$ is located.

If we apply our formula to a single elementary charged particle (say, an electron), and the
field which the charge itself produces, we arrive at the result that the charge must have a
certain “self”-potential energy equal to eφ/2, where φ is the potential of the field produced
by the charge at the point where it is located. But we know that in the theory of relativity
every elementary particle must be considered as pointlike. The potential φ = e/R of its field
becomes infinite at the point R = 0. Thus according to electrodynamics, the electron would
have to have an infinite “self-energy”, and consequently also an infinite mass. The physical 
absurdity of this result shows that the basic principles of electrodynamics itself lead to the 
result that its application must be restricted to definite limits. 

We note that in view of the infinity obtained from electrodynamics for the self-energy and 
mass, it is impossible within the framework of classical electrodynamics itself to pose the 
question whether the total mass of the electron is electrodynamic (that is, associated with the 
electromagnetic self-energy of the particle).†

Since the occurrence of the physically meaningless infinite self-energy of the elementary 
particle is related to the fact that such a particle must be considered as pointlike, we can 
conclude that electrodynamics as a logically closed physical theory presents internal 
contradictions when we go to sufficiently small distances. We can pose the question as to the 
order of magnitude of such distances. We can answer this question by noting that for the 
electromagnetic self-energy of the electron we should obtain a value of the order of the rest
energy $mc^2$. If, on the other hand, we consider an electron as possessing a certain radius $R_0$,
then its self-potential energy would be of order $e^2/R_0$. From the requirement that these two
quantities be of the same order, $e^2/R_0 \sim mc^2$, we find

$$R_0 \sim \frac{e^2}{mc^2}. \qquad (37.3)$$

This dimension (the “radius” of the electron) determines the limit of applicability of
electrodynamics to the electron, and follows already from its fundamental principles. We
must, however, keep in mind that actually the limits of applicability of the classical
electrodynamics which is presented here lie much higher, because of the occurrence of
quantum phenomena.‡
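
To attach numbers to these orders of magnitude, the short sketch below evaluates $e^2/mc^2$ and the distance $\hbar/mc$ at which quantum effects set in, using rounded Gaussian (CGS) values of the constants. The numerical inputs and the variable names are supplied here for illustration; they are not taken from the text, and the Planck constant is assumed to mean the reduced constant ħ.

```python
# Rounded CGS (Gaussian) values; treat them as illustrative inputs.
e = 4.8032e-10      # electron charge, esu
m = 9.1094e-28      # electron mass, g
c = 2.9979e10       # velocity of light, cm/s
hbar = 1.0546e-27   # reduced Planck constant, erg*s (assumed meaning of "Planck's constant")

R0 = e**2 / (m * c**2)       # "radius" of the electron from e^2/R0 ~ m c^2, in cm
quantum_scale = hbar / (m * c)   # distance at which quantum effects become important, in cm
ratio = quantum_scale / R0       # equals hbar * c / e^2

print(f"R0 = {R0:.3e} cm, hbar/mc = {quantum_scale:.3e} cm, ratio = {ratio:.1f}")
```

The ratio of the two lengths comes out near 137, so the quantum limit is reached at distances roughly two orders of magnitude larger than $R_0$.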

We now turn again to formula (37.2). The potentials $\phi_a$ which appear there are equal, from
Coulomb’s law, to

$$\phi_a = \sum_b\frac{e_b}{R_{ab}}, \qquad (37.4)$$


where $R_{ab}$ is the distance between the charges $e_a$, $e_b$. The expression for the energy (37.2)
consists of two parts. First, it contains an infinite constant, the self-energy of the charges, not
depending on their mutual separations. The second part is the energy of interaction of the
charges, depending on their separations. Only this part has physical interest. It is equal to

$$U' = \frac{1}{2}\sum_a e_a\phi'_a, \qquad (37.5)$$

where

$$\phi'_a = \sum_{b\,(\neq a)}\frac{e_b}{R_{ab}} \qquad (37.6)$$

is the potential at the point of location of $e_a$, produced by all the charges other than $e_a$. In
other words, we can write

$$U' = \frac{1}{2}\sum_{a \neq b}\frac{e_a e_b}{R_{ab}}. \qquad (37.7)$$

In particular, the energy of interaction of two charges is

$$U' = \frac{e_1 e_2}{R_{12}}. \qquad (37.8)$$

† From the purely formal point of view, the finiteness of the electron mass can be handled by introducing
an infinite negative mass of nonelectromagnetic origin which compensates the infinity of the electromagnetic
mass (mass “renormalization”). However, we shall see later (§ 75) that this does not eliminate all the
internal contradictions of classical electrodynamics.

‡ Quantum effects become important for distances of the order of ħ/mc, where ħ is Planck’s constant. The
ratio of these distances to $R_0$ is of order $\hbar c/e^2 \sim 137$.

§ 38. The field of a uniformly moving charge 

We determine the field produced by a charge e, moving uniformly with velocity V. We call 
the laboratory frame the system K; the system of reference moving with the charge is the K'
system. Let the charge be located at the origin of coordinates of the K' system. The system
K' moves relative to K along the X axis; the axes Y and Z are parallel to Y' and Z'. At the time
t = 0 the origins of the two systems coincide. The coordinates of the charge in the K system
are consequently x = Vt, y = z = 0. In the K' system, we have a constant electric field with
vector potential A' = 0, and scalar potential equal to φ' = e/R', where $R'^2 = x'^2 + y'^2 + z'^2$. In
the K system, according to (24.1) for A' = 0,

$$\phi = \frac{\phi'}{\sqrt{1 - V^2/c^2}} = \frac{e}{R'\sqrt{1 - V^2/c^2}}. \qquad (38.1)$$

We must now express R' in terms of the coordinates x, y, z, in the K system. According to
the formulas for the Lorentz transformation

$$x' = \frac{x - Vt}{\sqrt{1 - V^2/c^2}}, \qquad y' = y, \qquad z' = z,$$

from which

$$R'^2 = \frac{(x - Vt)^2 + \left(1 - \frac{V^2}{c^2}\right)(y^2 + z^2)}{1 - V^2/c^2}. \qquad (38.2)$$

Substituting this in (38.1) we find

$$\phi = \frac{e}{R^*}, \qquad (38.3)$$

where we have introduced the notation

$$R^{*2} = (x - Vt)^2 + \left(1 - \frac{V^2}{c^2}\right)(y^2 + z^2). \qquad (38.4)$$

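The vector potential in the K system is equal to

$$\mathbf{A} = \phi\,\frac{\mathbf{V}}{c} = \frac{e\mathbf{V}}{cR^*}. \qquad (38.5)$$

In the K' system the magnetic field H' is absent and the electric field is

$$\mathbf{E}' = \frac{e\mathbf{R}'}{R'^3}.$$

From formula (24.2), we find

$$E_x = E'_x = \frac{ex'}{R'^3}, \qquad E_y = \frac{E'_y}{\sqrt{1 - V^2/c^2}} = \frac{ey'}{R'^3\sqrt{1 - V^2/c^2}}, \qquad E_z = \frac{ez'}{R'^3\sqrt{1 - V^2/c^2}}.$$

Substituting for R', x', y', z', their expressions in terms of x, y, z, we obtain

$$\mathbf{E} = \left(1 - \frac{V^2}{c^2}\right)\frac{e\mathbf{R}}{R^{*3}}, \qquad (38.6)$$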


where R is the radius vector from the charge e to the field point with coordinates x, y, z (its 
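components are x − Vt, y, z).

This expression for E can be written in another form by introducing the angle θ between
the direction of motion and the radius vector R. It is clear that $y^2 + z^2 = R^2\sin^2\theta$, and
therefore $R^{*2}$ can be written in the form:

$$R^{*2} = R^2\left(1 - \frac{V^2}{c^2}\sin^2\theta\right). \qquad (38.7)$$

Then we have for E,

$$\mathbf{E} = \frac{e\mathbf{R}}{R^3}\,\frac{1 - V^2/c^2}{\left(1 - \frac{V^2}{c^2}\sin^2\theta\right)^{3/2}}. \qquad (38.8)$$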


For a fixed distance R from the charge, the value of the field E increases as θ increases
from 0 to π/2 (or as θ decreases from π to π/2). The field along the direction of motion
(θ = 0, π) has the smallest value; it is equal to

$$E_\parallel = \frac{e}{R^2}\left(1 - \frac{V^2}{c^2}\right).$$

The largest field is that perpendicular to the velocity (θ = π/2), equal to

$$E_\perp = \frac{e}{R^2}\,\frac{1}{\sqrt{1 - V^2/c^2}}.$$

We note that as the velocity increases, the field $E_\parallel$ decreases, while $E_\perp$ increases. We can
describe this pictorially by saying that the electric field of a moving charge is “contracted”
in the direction of motion. For velocities V close to the velocity of light, the denominator in
formula (38.8) is close to zero in a narrow interval of values θ around the value θ = π/2. The
“width” of this interval is, in order of magnitude,

$$\Delta\theta \sim \sqrt{1 - \frac{V^2}{c^2}}.$$


Thus the electric field of a rapidly moving charge at a given distance from it is large only 
in a narrow range of angles in the neighbourhood of t he equ atorial plane, and the width of 
this interval decreases with increasing Vlike - ( V 2 lc 2 ). 

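The magnetic field in the K system is

$$\mathbf{H} = \frac{1}{c}\mathbf{V}\times\mathbf{E} \tag{38.9}$$

[see (24.5)]. In particular, for $V \ll c$ the electric field is given approximately by the usual
formula for the Coulomb law, $\mathbf{E} = e\mathbf{R}/R^3$, and the magnetic field is

$$\mathbf{H} = \frac{e}{c}\,\frac{\mathbf{V}\times\mathbf{R}}{R^3}. \tag{38.10}$$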
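
The angular dependence in (38.8) can be tabulated numerically. The following is a minimal sketch, assuming Gaussian units and illustrative values e = R = 1 and V/c = 0.9; the function name `field_magnitude` and the parameter `beta` (standing for V/c) are not from the text.

```python
import math

def field_magnitude(e, R, theta, beta):
    """|E| of a uniformly moving charge from (38.8); beta stands for V/c."""
    return e / R**2 * (1 - beta**2) / (1 - beta**2 * math.sin(theta) ** 2) ** 1.5

# Illustrative values (assumed, not from the text).
e, R, beta = 1.0, 1.0, 0.9

E_par = field_magnitude(e, R, 0.0, beta)           # along the direction of motion
E_perp = field_magnitude(e, R, math.pi / 2, beta)  # perpendicular to the velocity

print("E_parallel      =", E_par)
print("E_perpendicular =", E_perp)
```

The two printed values should agree with the limiting expressions $(e/R^2)(1 - V^2/c^2)$ and $(e/R^2)/\sqrt{1 - V^2/c^2}$ quoted above.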


PROBLEM 

Determine the force (in the K system) between two charges moving with the same velocity V. 

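Solution: We shall determine the force $\mathbf{F}$ by computing the force acting on one of the charges ($e_1$) in the
field produced by the other ($e_2$). Using (38.9), we have

$$\mathbf{F} = e_1\mathbf{E}_2 + \frac{e_1}{c}\mathbf{V}\times\mathbf{H}_2 = e_1\left(1 - \frac{V^2}{c^2}\right)\mathbf{E}_2 + \frac{e_1}{c^2}\mathbf{V}(\mathbf{V}\cdot\mathbf{E}_2).$$

Substituting for $\mathbf{E}_2$ from (38.8), we get for the components of the force in the direction of motion ($F_x$) and
perpendicular to it ($F_y$):

$$F_x = \frac{e_1e_2}{R^2}\,\frac{\left(1 - \frac{V^2}{c^2}\right)\cos\theta}{\left(1 - \frac{V^2}{c^2}\sin^2\theta\right)^{3/2}}, \qquad F_y = \frac{e_1e_2}{R^2}\,\frac{\left(1 - \frac{V^2}{c^2}\right)^2\sin\theta}{\left(1 - \frac{V^2}{c^2}\sin^2\theta\right)^{3/2}},$$

where $\mathbf{R}$ is the radius vector from $e_2$ to $e_1$, and $\theta$ is the angle between $\mathbf{R}$ and $\mathbf{V}$.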

§ 39. Motion in the Coulomb field 

We consider the motion of a particle with mass m and charge e in the field produced by
a second charge e'; we assume that the mass of this second charge is so large that it can be
considered as fixed. Then our problem becomes the study of the motion of a charge e in a
centrally symmetric electric field with potential $\phi = e'/r$.

The total energy $\mathcal{E}$ of the particle is equal to

$$\mathcal{E} = c\sqrt{p^2 + m^2c^2} + \frac{\alpha}{r},$$

where $\alpha = ee'$. If we use polar coordinates in the plane of motion of the particle, then as we
know from mechanics,

$$p^2 = (M^2/r^2) + p_r^2,$$

where $p_r$ is the radial component of the momentum, and M is the constant angular momentum
of the particle. Then

$$\mathcal{E} = c\sqrt{p_r^2 + \frac{M^2}{r^2} + m^2c^2} + \frac{\alpha}{r}. \tag{39.1}$$

We discuss the question whether the particle during its motion can approach arbitrarily close
to the centre. First of all, it is clear that this is never possible if the charges e and e' repel each
other, that is, if e and e' have the same sign. Furthermore, in the case of attraction (e and e'
of opposite sign), arbitrarily close approach to the centre is not possible if $Mc > |\alpha|$, for
in this case the first term in (39.1) is always larger than the second, and for $r \to 0$ the right
side of the equation would approach infinity. On the other hand, if $Mc < |\alpha|$, then as
$r \to 0$, this expression can remain finite (here it is understood that $p_r$ approaches infinity).
Thus, if

$$cM < |\alpha|, \tag{39.2}$$

the particle during its motion "falls in" toward the charge attracting it, in contrast to
nonrelativistic mechanics, where for the Coulomb field such a collapse is generally impossible
(with the exception of the one case M = 0, where the particle e moves on a line toward the
particle e').

A complete determination of the motion of a charge in a Coulomb field starts most 
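conveniently from the Hamilton-Jacobi equation. We choose polar coordinates r, $\phi$, in the
plane of the motion. The Hamilton-Jacobi equation (16.11) has the form

$$-\frac{1}{c^2}\left(\frac{\partial S}{\partial t} + \frac{\alpha}{r}\right)^2 + \left(\frac{\partial S}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial S}{\partial\phi}\right)^2 + m^2c^2 = 0.$$

We seek an S of the form

$$S = -\mathcal{E}t + M\phi + f(r),$$

where $\mathcal{E}$ and M are the constant energy and angular momentum of the moving particle. The
result is

$$S = -\mathcal{E}t + M\phi + \int\sqrt{\frac{1}{c^2}\left(\mathcal{E} - \frac{\alpha}{r}\right)^2 - \frac{M^2}{r^2} - m^2c^2}\;dr. \tag{39.3}$$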


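The trajectory is determined by the equation $\partial S/\partial M = \text{const}$. Integration of (39.3) leads to the
following results for the trajectory:

(a) If $Mc > |\alpha|$,

$$(c^2M^2 - \alpha^2)\frac{1}{r} = c\sqrt{(M\mathcal{E})^2 - m^2c^2(M^2c^2 - \alpha^2)}\,\cos\left(\phi\sqrt{1 - \frac{\alpha^2}{c^2M^2}}\right) - \mathcal{E}\alpha. \tag{39.4}$$

(b) If $Mc < |\alpha|$,

$$(\alpha^2 - M^2c^2)\frac{1}{r} = \pm c\sqrt{(M\mathcal{E})^2 + m^2c^2(\alpha^2 - M^2c^2)}\,\cosh\left(\phi\sqrt{\frac{\alpha^2}{c^2M^2} - 1}\right) + \mathcal{E}\alpha. \tag{39.5}$$

(c) If $Mc = |\alpha|$,

$$\frac{2\mathcal{E}\alpha}{r} = \mathcal{E}^2 - m^2c^4 - \phi^2\left(\frac{\mathcal{E}\alpha}{cM}\right)^2. \tag{39.6}$$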


The integration constant is contained in the arbitrary choice of the reference line for
measurement of the angle $\phi$.
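
As a consistency check, the trajectory (39.4) can be substituted back into the energy relation (39.1). The sketch below assumes units with m = c = 1, illustrative values of $\alpha$, M and $\mathcal{E}$ chosen so that $Mc > |\alpha|$ and $\mathcal{E} < mc^2$, and the relation $p_r = -M\,d(1/r)/d\phi$, which follows from $p_r = m\gamma\dot{r}$ and $M = m\gamma r^2\dot\phi$ and is not written out in the text. All function names are hypothetical.

```python
import math

# Illustrative parameters in units with m = c = 1 (assumed, not from the text).
m, c = 1.0, 1.0
alpha, M, E = -0.5, 1.0, 0.95   # attraction, M c > |alpha|, E < m c^2

kappa = math.sqrt(1 - alpha**2 / (c * M) ** 2)
root = c * math.sqrt((M * E) ** 2 - m**2 * c**2 * (M**2 * c**2 - alpha**2))
denom = c**2 * M**2 - alpha**2

def inv_r(phi):
    """u = 1/r as a function of the polar angle, from (39.4)."""
    return (root * math.cos(kappa * phi) - E * alpha) / denom

def d_inv_r(phi):
    """du/dphi, obtained by differentiating (39.4) by hand."""
    return -root * kappa * math.sin(kappa * phi) / denom

def energy_residual(phi):
    """(E - alpha u)^2 - c^2 (p_r^2 + M^2 u^2 + m^2 c^2); (39.1) requires zero."""
    u = inv_r(phi)
    p_r = -M * d_inv_r(phi)   # assumed relation p_r = -M du/dphi
    return (E - alpha * u) ** 2 - c**2 * (p_r**2 + M**2 * u**2 + m**2 * c**2)

print(max(abs(energy_residual(0.3 * k)) for k in range(50)))
print(inv_r(0.0), inv_r(2 * math.pi), inv_r(2 * math.pi / kappa))
```

The second line of output compares 1/r at $\phi = 0$, at $\phi = 2\pi$ and at $\phi = 2\pi/\sqrt{1 - \alpha^2/c^2M^2}$: the distance recurs only after the last of these angles, which exceeds $2\pi$.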

In (39.4) the ambiguity of sign in front of the square root is unimportant, since it already
contains the arbitrary reference origin of the angle $\phi$ under the cos. In the case of attraction
($\alpha < 0$) the trajectory corresponding to this equation lies entirely at finite values of r (finite
motion), if $\mathcal{E} < mc^2$. If $\mathcal{E} > mc^2$, then r can go to infinity (infinite motion). The finite motion
corresponds to motion in a closed orbit (ellipse) in nonrelativistic mechanics. From (39.4)
it is clear that in relativistic mechanics the trajectory can never be closed; when the angle $\phi$
changes by $2\pi$, the distance r from the centre does not return to its initial value. In place of
ellipses we here get orbits in the form of open "rosettes". Thus, whereas in nonrelativistic
mechanics the finite motion in a Coulomb field leads to a closed orbit, in relativistic mechanics
the Coulomb field loses this property.

In (39.5) we must choose the positive sign for the root in case $\alpha < 0$, and the negative sign
if $\alpha > 0$ [the opposite choice of sign would correspond to a reversal of the sign of the root
in (39.1)].

For $\alpha < 0$ the trajectories (39.5) and (39.6) are spirals in which the distance r approaches
0 as $\phi \to \infty$. The time required for the "falling in" of the charge to the coordinate origin is
finite. This can be verified by noting that the dependence of the coordinate r on the time is
determined by the equation $\partial S/\partial\mathcal{E} = \text{const}$; substituting (39.3), we see that the time is determined
by an integral which converges for $r \to 0$.


PROBLEMS 

1. Determine the angle of deflection of a charge passing through a repulsive Coulomb field ($\alpha > 0$).
Solution: The angle of deflection $\chi$ equals $\chi = \pi - 2\phi_0$, where $2\phi_0$ is the angle between the two asymptotes
of the trajectory (39.4). We find

$$\chi = \pi - \frac{2cM}{\sqrt{c^2M^2 - \alpha^2}}\tan^{-1}\left(\frac{v\sqrt{c^2M^2 - \alpha^2}}{c\alpha}\right),$$

where v is the velocity of the charge at infinity.



2. Determine the effective scattering cross section at small angles for the scattering of particles in a 
Coulomb field. 

Solution: The effective cross section $d\sigma$ is the ratio of the number of particles scattered per second into
a given element do of solid angle to the flux density of impinging particles (i.e., to the number of particles
crossing one square centimetre, per second, of a surface perpendicular to the beam of particles).

Since the angle of deflection $\chi$ of the particle during its passage through the field is determined by the
impact parameter $\rho$ (i.e. the distance from the centre to the line along which the particle would move in the
absence of the field),

$$d\sigma = 2\pi\rho\,d\rho = 2\pi\rho\frac{d\rho}{d\chi}d\chi = \rho\frac{d\rho}{d\chi}\frac{do}{\sin\chi},$$

where $do = 2\pi\sin\chi\,d\chi$.† The angle of deflection (for small angles) can be taken equal to the ratio of the
change in momentum to its initial value. The change in momentum is equal to the time integral of the force
acting on the charge, in the direction perpendicular to the direction of motion; this force is approximately
$(\alpha/r^2)\cdot(\rho/r)$. Thus we have

$$\chi = \frac{1}{p}\int_{-\infty}^{+\infty}\frac{\alpha\rho\,dt}{(\rho^2 + v^2t^2)^{3/2}} = \frac{2\alpha}{p\rho v}$$

(v is the velocity of the particles). From this we find the effective cross section for small $\chi$:

$$d\sigma = 4\left(\frac{\alpha}{pv}\right)^2\frac{do}{\chi^4}.$$

In the nonrelativistic case, $p \approx mv$, and the expression coincides with the one obtained from the Rutherford
formula‡ for small $\chi$.
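
The time integral for the small deflection angle in Problem 2 can be evaluated numerically and compared with the closed form $2\alpha/(p\rho v)$. This is a minimal sketch with illustrative parameter values; the function name `deflection_angle` and the substitution $t = (\rho/v)\tan s$, used only to map the infinite range onto a finite one, are choices made here and not part of the text.

```python
import math

def deflection_angle(alpha, p, v, rho, n=20000):
    """Small-angle deflection: (1/p) * integral of alpha*rho / (rho^2 + v^2 t^2)^(3/2) dt.

    The substitution t = (rho / v) * tan(s) maps t in (-inf, inf) onto
    s in (-pi/2, pi/2); the integral is then done with the midpoint rule.
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        s = -math.pi / 2 + (k + 0.5) * h
        t = rho / v * math.tan(s)
        dt_ds = rho / v / math.cos(s) ** 2
        total += alpha * rho / (rho**2 + (v * t) ** 2) ** 1.5 * dt_ds * h
    return total / p

# Illustrative values (assumed, not from the text).
alpha, p, v, rho = 0.3, 2.0, 0.8, 5.0
chi_numeric = deflection_angle(alpha, p, v, rho)
chi_closed = 2 * alpha / (p * rho * v)
print(chi_numeric, chi_closed)
```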


§ 40. The dipole moment 

We consider the field produced by a system of charges at large distances, that is, at 
distances large compared with the dimensions of the system. 

We introduce a coordinate system with origin anywhere within the system of charges. Let
the radius vectors of the various charges be $\mathbf{r}_a$. The potential of the field produced by all the
charges at the point having the radius vector $\mathbf{R}_0$ is

$$\phi = \sum_a \frac{e_a}{|\mathbf{R}_0 - \mathbf{r}_a|} \tag{40.1}$$

(the summation goes over all charges); here $\mathbf{R}_0 - \mathbf{r}_a$ are the radius vectors from the charges
$e_a$ to the point where we are finding the potential.

We must investigate this expression for large $R_0$ ($R_0 \gg r_a$). To do this, we expand it in
powers of $r_a/R_0$, using the formula

$$f(\mathbf{R}_0 - \mathbf{r}) \approx f(\mathbf{R}_0) - \mathbf{r}\cdot\operatorname{grad} f(\mathbf{R}_0)$$

(in the grad, the differentiation applies to the coordinates of the vector $\mathbf{R}_0$). To terms of first
order,

$$\phi = \frac{\sum e_a}{R_0} - \sum e_a\mathbf{r}_a\cdot\operatorname{grad}\frac{1}{R_0}. \tag{40.2}$$


† See Mechanics, § 18.
‡ See Mechanics, § 19.



The sum 



$$\mathbf{d} = \sum e_a\mathbf{r}_a \tag{40.3}$$

is called the dipole moment of the system of charges. It is important to note that if the sum
of all the charges, $\sum e_a$, is zero, then the dipole moment does not depend on the choice of the
origin of coordinates, for the radius vectors $\mathbf{r}_a$ and $\mathbf{r}'_a$ of one and the same charge in two
different coordinate systems are related by

$$\mathbf{r}'_a = \mathbf{r}_a + \mathbf{a},$$

where $\mathbf{a}$ is some constant vector. Therefore if $\sum e_a = 0$, the dipole moment is the same in both
systems:

$$\mathbf{d}' = \sum e_a\mathbf{r}'_a = \sum e_a\mathbf{r}_a + \mathbf{a}\sum e_a = \mathbf{d}.$$
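
The independence of the dipole moment from the choice of origin is easy to confirm for a concrete neutral configuration. The sketch below uses three illustrative charges and an arbitrary shift vector; the helper names `dipole_moment` and `shift` are hypothetical.

```python
def dipole_moment(charges, positions):
    """d = sum of e_a * r_a, as in (40.3)."""
    return tuple(sum(e * r[i] for e, r in zip(charges, positions)) for i in range(3))

def shift(positions, a):
    """Radius vectors in a coordinate system displaced by the constant vector a."""
    return [tuple(r[i] + a[i] for i in range(3)) for r in positions]

# Illustrative neutral system (assumed values): total charge 1 - 2 + 1 = 0.
charges = [1.0, -2.0, 1.0]
positions = [(0.0, 0.0, 1.0), (0.5, 0.0, 0.0), (0.0, -1.0, 0.0)]
a = (3.0, -4.0, 5.0)

d = dipole_moment(charges, positions)
d_shifted = dipole_moment(charges, shift(positions, a))
print(d, d_shifted)
```

Replacing the middle charge by a value that makes the total charge nonzero changes `d_shifted` by the shift vector times the total charge, in line with the formula for $\mathbf{d}'$.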

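If we denote by $e_a^+$, $\mathbf{r}_a^+$ and $-e_a^-$, $\mathbf{r}_a^-$ the positive and negative charges of the system and
their radius vectors, then we can write the dipole moment in the form

$$\mathbf{d} = \sum e_a^+\mathbf{r}_a^+ - \sum e_a^-\mathbf{r}_a^- = \mathbf{R}^+\sum e_a^+ - \mathbf{R}^-\sum e_a^- \tag{40.4}$$

where

$$\mathbf{R}^+ = \frac{\sum e_a^+\mathbf{r}_a^+}{\sum e_a^+}, \qquad \mathbf{R}^- = \frac{\sum e_a^-\mathbf{r}_a^-}{\sum e_a^-} \tag{40.5}$$

are the radius vectors of the "charge centres" for the positive and negative charges. If
$\sum e_a^+ = \sum e_a^- = e$, then

$$\mathbf{d} = e\mathbf{R}_{+-}, \tag{40.6}$$

where $\mathbf{R}_{+-} = \mathbf{R}^+ - \mathbf{R}^-$ is the radius vector from the centre of negative to the centre of positive
charge. In particular, if we have altogether two charges, then $\mathbf{R}_{+-}$ is the radius vector
between them.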

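If the total charge of the system is zero, then the potential of the field of this system at
large distances is

$$\phi = -\mathbf{d}\cdot\nabla\frac{1}{R_0} = \frac{\mathbf{d}\cdot\mathbf{R}_0}{R_0^3}. \tag{40.7}$$

The field intensity is:

$$\mathbf{E} = -\operatorname{grad}\frac{\mathbf{d}\cdot\mathbf{R}_0}{R_0^3} = -\frac{1}{R_0^3}\operatorname{grad}(\mathbf{d}\cdot\mathbf{R}_0) - (\mathbf{d}\cdot\mathbf{R}_0)\operatorname{grad}\frac{1}{R_0^3},$$

or finally,

$$\mathbf{E} = \frac{3(\mathbf{n}\cdot\mathbf{d})\mathbf{n} - \mathbf{d}}{R_0^3}, \tag{40.8}$$

where $\mathbf{n}$ is a unit vector along $\mathbf{R}_0$. Another useful expression for the field is

$$\mathbf{E} = (\mathbf{d}\cdot\nabla)\nabla\frac{1}{R_0}. \tag{40.9}$$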
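
The step from the dipole potential (40.7) to the field (40.8) can be checked by differentiating the potential numerically. The following is a minimal sketch using central differences; the dipole moment, the field point and the step size are illustrative assumptions, and the function names are hypothetical.

```python
import math

def phi_dipole(d, R):
    """Dipole potential d . R / R^3, as in (40.7)."""
    r = math.sqrt(sum(x * x for x in R))
    return sum(di * xi for di, xi in zip(d, R)) / r**3

def field_formula(d, R):
    """Closed form (40.8): [3 (n . d) n - d] / R^3."""
    r = math.sqrt(sum(x * x for x in R))
    n = [x / r for x in R]
    nd = sum(ni * di for ni, di in zip(n, d))
    return [(3 * nd * ni - di) / r**3 for ni, di in zip(n, d)]

def field_numeric(d, R, h=1e-5):
    """E = -grad(phi), each component by a central difference of step h."""
    E = []
    for i in range(3):
        Rp, Rm = list(R), list(R)
        Rp[i] += h
        Rm[i] -= h
        E.append(-(phi_dipole(d, Rp) - phi_dipole(d, Rm)) / (2 * h))
    return E

# Illustrative values (assumed, not from the text).
d = (0.3, -0.2, 1.0)
R = (1.0, 2.0, -1.5)
print(field_formula(d, R))
print(field_numeric(d, R))
```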


Thus the potential of the field at large distances produced by a system of charges with total
charge equal to zero is inversely proportional to the square of the distance, and the field
intensity is inversely proportional to the cube of the distance. This field has axial symmetry
around the direction of $\mathbf{d}$. In a plane passing through this direction (which we choose as the
z axis), the components of the vector $\mathbf{E}$ are:

$$E_z = d\,\frac{3\cos^2\theta - 1}{R_0^3}, \qquad E_x = d\,\frac{3\sin\theta\cos\theta}{R_0^3}. \tag{40.10}$$

The radial and tangential components in this plane (the tangential one taken in the direction
of increasing $\theta$) are

$$E_R = d\,\frac{2\cos\theta}{R_0^3}, \qquad E_\theta = d\,\frac{\sin\theta}{R_0^3}. \tag{40.11}$$


§ 41. Multipole moments 

In the expansion of the potential in powers of $1/R_0$,

$$\phi = \phi^{(0)} + \phi^{(1)} + \phi^{(2)} + \dots, \tag{41.1}$$

the term $\phi^{(n)}$ is proportional to $1/R_0^{n+1}$. We saw that the first term, $\phi^{(0)}$, is determined by the
sum of all the charges; the second term, $\phi^{(1)}$, sometimes called the dipole potential of the
system, is determined by the dipole moment of the system.

The third term in the expansion is

$$\phi^{(2)} = \frac{1}{2}\sum e\,x_\alpha x_\beta\frac{\partial^2}{\partial X_\alpha\partial X_\beta}\frac{1}{R_0}, \tag{41.2}$$

where the sum goes over all charges; we here drop the index numbering the charges; $x_\alpha$ are
the components of the vector $\mathbf{r}$, and $X_\alpha$ those of the vector $\mathbf{R}_0$. This part of the potential is
usually called the quadrupole potential. If the sum of the charges and the dipole moment of
the system are both equal to zero, the expansion begins with $\phi^{(2)}$.

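In the expression (41.2) there enter the six quantities $\sum e\,x_\alpha x_\beta$. However, it is easy to see
that the field depends not on six independent quantities, but only on five. This follows from
the fact that the function $1/R_0$ satisfies the Laplace equation, that is,

$$\Delta\frac{1}{R_0} \equiv \delta_{\alpha\beta}\frac{\partial^2}{\partial X_\alpha\partial X_\beta}\frac{1}{R_0} = 0.$$

We can therefore write $\phi^{(2)}$ in the form

$$\phi^{(2)} = \frac{1}{2}\sum e\left(x_\alpha x_\beta - \frac{1}{3}r^2\delta_{\alpha\beta}\right)\frac{\partial^2}{\partial X_\alpha\partial X_\beta}\frac{1}{R_0}.$$

The tensor

$$D_{\alpha\beta} = \sum e(3x_\alpha x_\beta - r^2\delta_{\alpha\beta}) \tag{41.3}$$

is called the quadrupole moment of the system. From the definition of $D_{\alpha\beta}$ it is clear that
the sum of its diagonal elements is zero:

$$D_{\alpha\alpha} = 0. \tag{41.4}$$

Therefore the symmetric tensor $D_{\alpha\beta}$ has altogether five independent components. With the
aid of $D_{\alpha\beta}$, we can write

$$\phi^{(2)} = \frac{D_{\alpha\beta}}{6}\frac{\partial^2}{\partial X_\alpha\partial X_\beta}\frac{1}{R_0}, \tag{41.5}$$

or, performing the differentiation,

$$\frac{\partial^2}{\partial X_\alpha\partial X_\beta}\frac{1}{R_0} = \frac{3X_\alpha X_\beta}{R_0^5} - \frac{\delta_{\alpha\beta}}{R_0^3},$$

and using the fact that $\delta_{\alpha\beta}D_{\alpha\beta} = D_{\alpha\alpha} = 0$,

$$\phi^{(2)} = \frac{D_{\alpha\beta}n_\alpha n_\beta}{2R_0^3}. \tag{41.6}$$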


Like every symmetric three-dimensional tensor, the tensor $D_{\alpha\beta}$ can be brought to principal
axes. Because of (41.4), in general only two of the three principal values will be independent.
If it happens that the system of charges is symmetric around some axis (the z axis),† then this
axis must be one of the principal axes of the tensor $D_{\alpha\beta}$, the location of the other two axes
in the x, y plane is arbitrary, and the three principal values are related to one another:

$$D_{xx} = D_{yy} = -\tfrac{1}{2}D_{zz}. \tag{41.7}$$

Denoting the component $D_{zz}$ by D (in this case it is simply called the quadrupole moment),
we get for the potential

$$\phi^{(2)} = \frac{D}{4R_0^3}(3\cos^2\theta - 1) = \frac{D}{2R_0^3}P_2(\cos\theta), \tag{41.8}$$

where $\theta$ is the angle between $\mathbf{R}_0$ and the z axis, and $P_2$ is a Legendre polynomial.
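
Formula (41.8) can be compared with the exact potential of a simple axially symmetric system. The sketch below uses an assumed "linear quadrupole" (charges q at z = ±a and -2q at the origin), for which the total charge and the dipole moment both vanish; the configuration, the numerical values and the function names are illustrative and not taken from the text.

```python
import math

def exact_potential(charges, positions, X):
    """Sum of e_a / |X - r_a| over all charges (Gaussian units)."""
    return sum(e / math.dist(X, r) for e, r in zip(charges, positions))

def quadrupole_zz(charges, positions):
    """D_zz = sum of e (3 z^2 - r^2), from the definition (41.3)."""
    return sum(e * (3 * r[2] ** 2 - sum(x * x for x in r))
               for e, r in zip(charges, positions))

def phi2_axial(D, R0, theta):
    """Quadrupole potential (41.8) of an axially symmetric system."""
    return D / (4 * R0**3) * (3 * math.cos(theta) ** 2 - 1)

# Assumed configuration: q at z = +a and z = -a, and -2q at the origin.
q, a = 1.0, 0.1
charges = [q, q, -2 * q]
positions = [(0.0, 0.0, a), (0.0, 0.0, -a), (0.0, 0.0, 0.0)]

R0, theta = 50.0, 0.7
X = (R0 * math.sin(theta), 0.0, R0 * math.cos(theta))

D = quadrupole_zz(charges, positions)
print(D, exact_potential(charges, positions, X), phi2_axial(D, R0, theta))
```

For this configuration the two potentials should differ only by terms of relative order $(a/R_0)^2$, since the next nonvanishing multipole is two orders higher.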

Just as we did for the dipole moment in the preceding section, we can easily show that the 
quadrupole moment of a system does not depend on the choice of the coordinate origin, if 
both the total charge and the dipole moment of the system are equal to zero. 

In similar fashion we could also write the succeeding terms of the expansion (41.1). The
l'th term of the expansion defines a tensor (which is called the tensor of the $2^l$-pole moment)
of rank l, symmetric in all its indices and vanishing when contracted on any pair of indices;
it can be shown that such a tensor has 2l + 1 independent components.‡

We shall express the general term in the expansion of the potential in another form, by
using the well-known formula of the theory of spherical harmonics

$$\frac{1}{|\mathbf{R}_0 - \mathbf{r}|} = \frac{1}{\sqrt{R_0^2 + r^2 - 2rR_0\cos\chi}} = \sum_{l=0}^{\infty}\frac{r^l}{R_0^{l+1}}P_l(\cos\chi), \tag{41.9}$$

where $\chi$ is the angle between $\mathbf{R}_0$ and $\mathbf{r}$. We introduce the spherical angles $\Theta$, $\Phi$ and $\theta$, $\phi$,
formed by the vectors $\mathbf{R}_0$ and $\mathbf{r}$, respectively, with the fixed coordinate axes, and use the
addition theorem for the spherical harmonics:


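† We are assuming a symmetry axis of any order higher than the second.

‡ Such a tensor is said to be irreducible. The vanishing on contraction means that no tensor of lower rank
can be formed from the components.

$$P_l(\cos\chi) = \sum_{m=-l}^{l}\frac{(l - |m|)!}{(l + |m|)!}P_l^{|m|}(\cos\Theta)P_l^{|m|}(\cos\theta)e^{-im(\Phi - \phi)}, \tag{41.10}$$

where the $P_l^m$ are the associated Legendre polynomials.
We also introduce the spherical functions†

$$Y_{lm}(\theta, \phi) = (-1)^m i^l\sqrt{\frac{2l+1}{4\pi}\frac{(l-m)!}{(l+m)!}}\,P_l^m(\cos\theta)e^{im\phi}, \qquad m \geq 0,$$

$$Y_{l,-|m|}(\theta, \phi) = (-1)^{l-m}Y^*_{l|m|}. \tag{41.11}$$

Then the expansion (41.9) takes the form:

$$\frac{1}{|\mathbf{R}_0 - \mathbf{r}|} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{r^l}{R_0^{l+1}}\frac{4\pi}{2l+1}Y^*_{lm}(\Theta, \Phi)Y_{lm}(\theta, \phi).$$

Carrying out this expansion in each term of (40.1), we finally get the following expression
for the l'th term of the expansion of the potential:

$$\phi^{(l)} = \frac{1}{R_0^{l+1}}\sum_{m=-l}^{l}\sqrt{\frac{4\pi}{2l+1}}\,Q_m^{(l)}Y^*_{lm}(\Theta, \Phi), \tag{41.12}$$

where

$$Q_m^{(l)} = \sum_a e_a r_a^l\sqrt{\frac{4\pi}{2l+1}}\,Y_{lm}(\theta_a, \phi_a). \tag{41.13}$$

The set of 2l + 1 quantities $Q_m^{(l)}$ form the $2^l$-pole moment of the system of charges.

The quantities $Q_m^{(1)}$ defined in this way are related to the components of the dipole
moment vector $\mathbf{d}$ by the formulas

$$Q_0^{(1)} = id_z, \qquad Q_{\pm 1}^{(1)} = \mp\frac{i}{\sqrt{2}}(d_x \pm id_y). \tag{41.14}$$

The quantities $Q_m^{(2)}$ are related to the tensor components $D_{\alpha\beta}$ by the relations

$$Q_0^{(2)} = -\frac{1}{2}D_{zz}, \qquad Q_{\pm 1}^{(2)} = \pm\frac{1}{\sqrt{6}}(D_{xz} \pm iD_{yz}), \qquad Q_{\pm 2}^{(2)} = -\frac{1}{2\sqrt{6}}(D_{xx} - D_{yy} \pm 2iD_{xy}). \tag{41.15}$$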


PROBLEM 

Determine the quadrupole moment of a uniformly charged ellipsoid with respect to its centre. 
Solution: Replacing the summation in (41.3) by an integration over the volume of the ellipsoid, we have: 

$$D_{xx} = \rho\iiint(2x^2 - y^2 - z^2)\,dx\,dy\,dz, \quad \text{etc.}$$

† In accordance with the definition used in quantum mechanics.

Let us choose the coordinate axes along the axes of the ellipsoid with the origin at its centre; from symmetry
considerations it is obvious that these axes are the principal axes of the tensor $D_{\alpha\beta}$. By means of the
transformation

$$x = x'a, \qquad y = y'b, \qquad z = z'c$$

the integration over the volume of the ellipsoid

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$

is reduced to integration over the volume of the unit sphere

$$x'^2 + y'^2 + z'^2 = 1.$$

As a result we obtain:

$$D_{xx} = \frac{e}{5}(2a^2 - b^2 - c^2), \qquad D_{yy} = \frac{e}{5}(2b^2 - a^2 - c^2), \qquad D_{zz} = \frac{e}{5}(2c^2 - a^2 - b^2),$$

where $e = (4\pi/3)abc\rho$ is the total charge of the ellipsoid.

§ 42. System of charges in an external field 

We now consider a system of charges located in an external electric field. We designate
the potential of this external field by $\phi(\mathbf{r})$. The potential energy of each of the charges is
$e_a\phi(\mathbf{r}_a)$, and the total potential energy of the system is

$$U = \sum_a e_a\phi(\mathbf{r}_a). \tag{42.1}$$

We introduce another coordinate system with its origin anywhere within the system of
charges; $\mathbf{r}_a$ is the radius vector of the charge $e_a$ in these coordinates.

Let us assume that the external field changes slowly over the region of the system of
charges, i.e. is quasiuniform with respect to the system. Then we can expand the energy U
in powers of $r_a$:

$$U = U^{(0)} + U^{(1)} + U^{(2)} + \dots; \tag{42.2}$$

in this expansion the first term is

$$U^{(0)} = \phi_0\sum e_a, \tag{42.3}$$

where $\phi_0$ is the value of the potential at the origin. In this approximation, the energy of the
system is the same as it would be if all the charges were located at one point (the origin).
The second term in the expansion is

$$U^{(1)} = (\operatorname{grad}\phi)_0\cdot\sum e_a\mathbf{r}_a.$$

Introducing the field intensity $\mathbf{E}_0$ at the origin and the dipole moment $\mathbf{d}$ of the system, we
have

$$U^{(1)} = -\mathbf{d}\cdot\mathbf{E}_0. \tag{42.4}$$

The total force acting on the system in the external quasiuniform field is, to the order we
are considering,

$$\mathbf{F} = \mathbf{E}_0\sum e_a + [\nabla(\mathbf{d}\cdot\mathbf{E})]_0.$$

If the total charge is zero, the first term vanishes, and

$$\mathbf{F} = (\mathbf{d}\cdot\nabla)\mathbf{E}, \tag{42.5}$$

i.e. the force is determined by the derivatives of the field intensity (taken at the origin). The
total moment of the forces acting on the system is

$$\mathbf{K} = \sum(\mathbf{r}_a\times e_a\mathbf{E}_0) = \mathbf{d}\times\mathbf{E}_0, \tag{42.6}$$

i.e. to lowest order it is determined by the field intensity itself.

Let us assume that there are two systems, each having total charge zero, and with dipole
moments $\mathbf{d}_1$ and $\mathbf{d}_2$, respectively. Their mutual distance is assumed to be large in comparison
with their internal dimensions. Let us determine their potential energy of interaction, U. To
do this we regard one of the systems as being in the field of the other. Then

$$U = -\mathbf{d}_2\cdot\mathbf{E}_1,$$

where $\mathbf{E}_1$ is the field of the first system. Substituting (40.8) for $\mathbf{E}_1$, we find:

$$U = \frac{(\mathbf{d}_1\cdot\mathbf{d}_2)R^2 - 3(\mathbf{d}_1\cdot\mathbf{R})(\mathbf{d}_2\cdot\mathbf{R})}{R^5}, \tag{42.7}$$

where $\mathbf{R}$ is the vector separation between the two systems.

For the case where one of the systems has a total charge different from zero (and equal to
e), we obtain similarly

$$U = e\,\frac{\mathbf{d}\cdot\mathbf{R}}{R^3}, \tag{42.8}$$

where $\mathbf{R}$ is the vector directed from the dipole to the charge.
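
The interaction energy (42.7) can be checked against its definition as minus the scalar product of the second dipole moment with the field (40.8) of the first. The sketch below does this for arbitrary illustrative vectors and for two familiar special cases; all names and numerical values are assumptions made for the example.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dipole_field(d, R):
    """Field (40.8) of a dipole d at separation R, written as (3 (d.R) R - d R^2) / R^5."""
    R2 = dot(R, R)
    return [(3 * dot(d, R) * Ri - di * R2) / R2**2.5 for di, Ri in zip(d, R)]

def interaction_energy(d1, d2, R):
    """Dipole-dipole energy, formula (42.7)."""
    R2 = dot(R, R)
    return (dot(d1, d2) * R2 - 3 * dot(d1, R) * dot(d2, R)) / R2**2.5

# Illustrative vectors (assumed, not from the text).
d1, d2, R = (0.2, -1.0, 0.5), (1.0, 0.3, -0.7), (2.0, 1.0, 3.0)
U_from_field = -dot(d2, dipole_field(d1, R))
print(U_from_field, interaction_energy(d1, d2, R))

# Parallel unit dipoles: side by side (d perpendicular to R) and head to tail (d along R).
z = (0.0, 0.0, 1.0)
print(interaction_energy(z, z, (2.0, 0.0, 0.0)), interaction_energy(z, z, (0.0, 0.0, 2.0)))
```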

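The next term in the expansion (42.2) is

$$U^{(2)} = \frac{1}{2}\sum e\,x_\alpha x_\beta\frac{\partial^2\phi_0}{\partial x_\alpha\partial x_\beta}.$$

Here, as in § 41, we omit the index numbering the charge; the value of the second
derivative of the potential is taken at the origin; but the potential $\phi$ satisfies Laplace's
equation,

$$\frac{\partial^2\phi}{\partial x_\alpha^2} = \delta_{\alpha\beta}\frac{\partial^2\phi}{\partial x_\alpha\partial x_\beta} = 0.$$

Therefore we can write

$$U^{(2)} = \frac{1}{2}\frac{\partial^2\phi_0}{\partial x_\alpha\partial x_\beta}\sum e\left(x_\alpha x_\beta - \frac{1}{3}\delta_{\alpha\beta}r^2\right),$$

or, finally,

$$U^{(2)} = \frac{D_{\alpha\beta}}{6}\frac{\partial^2\phi_0}{\partial x_\alpha\partial x_\beta}. \tag{42.9}$$

The general term in the series (42.2) can be expressed in terms of the $2^l$-pole moments
$Q_m^{(l)}$ defined in the preceding section. To do this, we first expand the potential $\phi(\mathbf{r})$ in
spherical harmonics; the general form of this expansion is

$$\phi(\mathbf{r}) = \sum_{l=0}^{\infty}r^l\sum_{m=-l}^{l}\sqrt{\frac{4\pi}{2l+1}}\,a_{lm}Y_{lm}(\theta, \phi), \tag{42.10}$$

where r, $\theta$, $\phi$ are the spherical coordinates of a point and the $a_{lm}$ are constants. Forming the
sum (42.1) and using the definition (41.13), we obtain:

$$U^{(l)} = \sum_{m=-l}^{l}a_{lm}Q_m^{(l)}. \tag{42.11}$$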


§ 43. Constant magnetic field 

Let us consider the magnetic field produced by charges which perform a finite motion, in
which the particles are always within a finite region of space and the momenta also always
remain finite. Such a motion has a "stationary" character, and it is of interest to consider the
time average magnetic field $\overline{\mathbf{H}}$, produced by the charges; this field will now be a function
only of the coordinates and not of the time, that is, it will be constant.

In order to find equations for the average magnetic field $\overline{\mathbf{H}}$, we take the time average of
the Maxwell equations

$$\operatorname{div}\mathbf{H} = 0, \qquad \operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi}{c}\mathbf{j}.$$

The first of these gives simply

$$\operatorname{div}\overline{\mathbf{H}} = 0. \tag{43.1}$$

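In the second equation the average value of the derivative $\partial\mathbf{E}/\partial t$, like the derivative of any
quantity which varies over a finite range, is zero (cf. the footnote on p. 90). Therefore the
second Maxwell equation becomes

$$\operatorname{curl}\overline{\mathbf{H}} = \frac{4\pi}{c}\overline{\mathbf{j}}. \tag{43.2}$$

These two equations determine the constant field $\overline{\mathbf{H}}$.

We introduce the average vector potential $\overline{\mathbf{A}}$ in accordance with

$$\operatorname{curl}\overline{\mathbf{A}} = \overline{\mathbf{H}}.$$

We substitute this in equation (43.2). We find

$$\operatorname{grad}\operatorname{div}\overline{\mathbf{A}} - \Delta\overline{\mathbf{A}} = \frac{4\pi}{c}\overline{\mathbf{j}}.$$

But we know that the vector potential of a field is not uniquely defined, and we can impose
an arbitrary auxiliary condition on it. On this basis, we choose the potential $\overline{\mathbf{A}}$ so that

$$\operatorname{div}\overline{\mathbf{A}} = 0. \tag{43.3}$$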

Then the equation defining the vector potential of the constant magnetic field becomes

$$\Delta\overline{\mathbf{A}} = -\frac{4\pi}{c}\overline{\mathbf{j}}. \tag{43.4}$$

It is easy to find the solution of this equation by noting that (43.4) is completely analogous
to the Poisson equation (36.4) for the scalar potential of a constant electric field, where in
place of the charge density $\rho$ we here have the current density $\overline{\mathbf{j}}/c$. By analogy with the
solution (36.8) of the Poisson equation, we can write

$$\overline{\mathbf{A}} = \frac{1}{c}\int\frac{\overline{\mathbf{j}}}{R}\,dV, \tag{43.5}$$

where R is the distance from the field point to the volume element dV.

In formula (43.5) we can go over from the integral to a sum over the charges, by substituting
in place of $\mathbf{j}$ the product $\rho\mathbf{v}$, and recalling that all the charges are pointlike. In this we must
keep in mind that in the integral (43.5), R is simply an integration variable, and is therefore
not subject to the averaging process. If we write in place of the integral

$$\int\frac{\overline{\mathbf{j}}}{R}\,dV \quad\text{the sum}\quad \sum\frac{e_a\mathbf{v}_a}{R_a},$$

then $R_a$ here are the radius vectors of the various particles, which change during the motion
of the charges. Therefore we must write

$$\overline{\mathbf{A}} = \frac{1}{c}\sum\overline{\frac{e_a\mathbf{v}_a}{R_a}}, \tag{43.6}$$

where we average the whole expression under the summation sign.

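Knowing $\overline{\mathbf{A}}$, we can also find the magnetic field,

$$\overline{\mathbf{H}} = \operatorname{curl}\overline{\mathbf{A}} = \operatorname{curl}\frac{1}{c}\int\frac{\overline{\mathbf{j}}}{R}\,dV.$$

The curl operator refers to the coordinates of the field point. Therefore the curl can be
brought under the integral sign and $\overline{\mathbf{j}}$ can be treated as constant in the differentiation.
Applying the well-known formula

$$\operatorname{curl}f\mathbf{a} = f\operatorname{curl}\mathbf{a} + \operatorname{grad}f\times\mathbf{a},$$

where f and $\mathbf{a}$ are an arbitrary scalar and vector, to the product $\overline{\mathbf{j}}\cdot 1/R$, we get

$$\operatorname{curl}\frac{\overline{\mathbf{j}}}{R} = \operatorname{grad}\frac{1}{R}\times\overline{\mathbf{j}} = \frac{\overline{\mathbf{j}}\times\mathbf{R}}{R^3},$$

and consequently,

$$\overline{\mathbf{H}} = \frac{1}{c}\int\frac{\overline{\mathbf{j}}\times\mathbf{R}}{R^3}\,dV \tag{43.7}$$

(the radius vector $\mathbf{R}$ is directed from dV to the field point). This is the law of Biot and Savart.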
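
As an illustration of (43.7), the field on the axis of a circular current loop can be summed numerically. The sketch below assumes a thin wire, so that the volume integral of the current density reduces to a line integral with $\mathbf{j}\,dV$ replaced by $I\,d\mathbf{l}$; this replacement, the Gaussian-units closed form $2\pi Ia^2/c(a^2 + z^2)^{3/2}$ used for comparison, and all names and values are assumptions for the example rather than statements from the text.

```python
import math

def loop_field_on_axis(I, a, z, c=1.0, n=2000):
    """H_z at height z on the axis of a loop of radius a carrying current I.

    Discretises (I / c) * closed integral of (dl x R) / R^3, with R directed
    from the current element to the field point (0, 0, z).
    """
    Hz = 0.0
    dphi = 2 * math.pi / n
    for k in range(n):
        phi = (k + 0.5) * dphi
        # Position of the current element and the line element dl.
        x, y = a * math.cos(phi), a * math.sin(phi)
        dlx, dly = -a * math.sin(phi) * dphi, a * math.cos(phi) * dphi
        # Vector from the element to the field point.
        Rx, Ry, Rz = -x, -y, z
        R3 = (Rx * Rx + Ry * Ry + Rz * Rz) ** 1.5
        Hz += (dlx * Ry - dly * Rx) / R3   # z component of dl x R
    return I / c * Hz

# Illustrative values (assumed, not from the text).
I, a, z, c = 1.5, 2.0, 0.7, 1.0
H_numeric = loop_field_on_axis(I, a, z, c)
H_closed = 2 * math.pi * I * a**2 / (c * (a**2 + z**2) ** 1.5)
print(H_numeric, H_closed)
```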


§ 44. Magnetic moments 

Let us consider the average magnetic field produced by a system of charges in stationary
motion, at large distances from the system.

We introduce a coordinate system with its origin anywhere within the system of charges,
just as we did in § 40. Again we denote the radius vectors of the various charges by $\mathbf{r}_a$, and
the radius vector of the point at which we calculate the field by $\mathbf{R}_0$. Then $\mathbf{R}_0 - \mathbf{r}_a$ is the radius
vector from the charge $e_a$ to the field point. According to (43.6), we have for the vector
potential:

$$\overline{\mathbf{A}} = \frac{1}{c}\sum\overline{\frac{e_a\mathbf{v}_a}{|\mathbf{R}_0 - \mathbf{r}_a|}}. \tag{44.1}$$


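As in § 40, we expand this expression in powers of $\mathbf{r}_a$. To terms of first order (we omit
the index a), we have

$$\overline{\mathbf{A}} = \frac{1}{cR_0}\sum e\overline{\mathbf{v}} - \frac{1}{c}\sum\overline{e\mathbf{v}\left(\mathbf{r}\cdot\nabla\frac{1}{R_0}\right)}.$$

In the first term we can write

$$\sum e\overline{\mathbf{v}} = \overline{\frac{d}{dt}\sum e\mathbf{r}}.$$

But the average value of the derivative of a quantity changing within a finite interval (like
$\sum e\mathbf{r}$) is zero. Thus there remains for $\overline{\mathbf{A}}$ the expression

$$\overline{\mathbf{A}} = -\frac{1}{c}\sum\overline{e\mathbf{v}\left(\mathbf{r}\cdot\nabla\frac{1}{R_0}\right)} = \frac{1}{cR_0^3}\sum\overline{e\mathbf{v}(\mathbf{r}\cdot\mathbf{R}_0)}.$$

We transform this expression as follows. Noting that $\mathbf{v} = \dot{\mathbf{r}}$, we can write (remembering
that $\mathbf{R}_0$ is a constant vector)

$$\sum e(\mathbf{R}_0\cdot\mathbf{r})\mathbf{v} = \frac{1}{2}\frac{d}{dt}\sum e\mathbf{r}(\mathbf{r}\cdot\mathbf{R}_0) + \frac{1}{2}\sum e[\mathbf{v}(\mathbf{r}\cdot\mathbf{R}_0) - \mathbf{r}(\mathbf{v}\cdot\mathbf{R}_0)].$$

Upon substitution of this expression in $\overline{\mathbf{A}}$, the average of the first term (containing the time
derivative) again goes to zero, and we get

$$\overline{\mathbf{A}} = \frac{1}{2cR_0^3}\sum e\,\overline{[\mathbf{v}(\mathbf{r}\cdot\mathbf{R}_0) - \mathbf{r}(\mathbf{v}\cdot\mathbf{R}_0)]}.$$

We introduce the vector

$$\mathfrak{m} = \frac{1}{2c}\sum e\,\mathbf{r}\times\mathbf{v}, \tag{44.2}$$

which is called the magnetic moment of the system. Then we get for $\overline{\mathbf{A}}$:

$$\overline{\mathbf{A}} = \frac{\overline{\mathfrak{m}}\times\mathbf{R}_0}{R_0^3} = \nabla\frac{1}{R_0}\times\overline{\mathfrak{m}}. \tag{44.3}$$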


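Knowing the vector potential, it is easy to find the magnetic field. With the aid of the
formula

$$\operatorname{curl}(\mathbf{a}\times\mathbf{b}) = (\mathbf{b}\cdot\nabla)\mathbf{a} - (\mathbf{a}\cdot\nabla)\mathbf{b} + \mathbf{a}\operatorname{div}\mathbf{b} - \mathbf{b}\operatorname{div}\mathbf{a},$$

we find

$$\overline{\mathbf{H}} = \operatorname{curl}\overline{\mathbf{A}} = \operatorname{curl}\left(\frac{\overline{\mathfrak{m}}\times\mathbf{R}_0}{R_0^3}\right) = \overline{\mathfrak{m}}\operatorname{div}\frac{\mathbf{R}_0}{R_0^3} - (\overline{\mathfrak{m}}\cdot\nabla)\frac{\mathbf{R}_0}{R_0^3}.$$

Furthermore,

$$\operatorname{div}\frac{\mathbf{R}_0}{R_0^3} = \mathbf{R}_0\cdot\operatorname{grad}\frac{1}{R_0^3} + \frac{1}{R_0^3}\operatorname{div}\mathbf{R}_0 = 0$$

and

$$(\overline{\mathfrak{m}}\cdot\nabla)\frac{\mathbf{R}_0}{R_0^3} = \frac{1}{R_0^3}(\overline{\mathfrak{m}}\cdot\nabla)\mathbf{R}_0 + \mathbf{R}_0\left(\overline{\mathfrak{m}}\cdot\nabla\frac{1}{R_0^3}\right) = \frac{\overline{\mathfrak{m}}}{R_0^3} - \frac{3\mathbf{R}_0(\overline{\mathfrak{m}}\cdot\mathbf{R}_0)}{R_0^5}.$$

Thus,

$$\overline{\mathbf{H}} = \frac{3\mathbf{n}(\overline{\mathfrak{m}}\cdot\mathbf{n}) - \overline{\mathfrak{m}}}{R_0^3}, \tag{44.4}$$

where $\mathbf{n}$ is again the unit vector along $\mathbf{R}_0$. We see that the magnetic field is expressed in
terms of the magnetic moment by the same formula by which the electric field was expressed
in terms of the dipole moment [see (40.8)].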

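If all the charges of the system have the same ratio of charge to mass, then we can write

$$\mathfrak{m} = \frac{1}{2c}\sum e\,\mathbf{r}\times\mathbf{v} = \frac{e}{2mc}\sum m\,\mathbf{r}\times\mathbf{v}.$$

If the velocities of all the charges $v \ll c$ then $m\mathbf{v}$ is the momentum $\mathbf{p}$ of the charge and we
get

$$\mathfrak{m} = \frac{e}{2mc}\sum\mathbf{r}\times\mathbf{p} = \frac{e}{2mc}\mathbf{M}, \tag{44.5}$$

where $\mathbf{M} = \sum\mathbf{r}\times\mathbf{p}$ is the mechanical angular momentum of the system. Thus in this case,
the ratio of magnetic moment to the angular momentum is constant and equal to e/2mc.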
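
The proportionality (44.5) can be illustrated with a few particles sharing one charge-to-mass ratio. The sketch below evaluates the magnetic moment from (44.2) and the angular momentum with $\mathbf{p} = m\mathbf{v}$ (the nonrelativistic assumption of the text) for arbitrary illustrative positions and velocities; the function and variable names are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def magnetic_moment(charges, positions, velocities, c):
    """(1 / 2c) * sum of e r x v, as in (44.2)."""
    total = [0.0, 0.0, 0.0]
    for e, r, v in zip(charges, positions, velocities):
        rv = cross(r, v)
        for i in range(3):
            total[i] += e * rv[i] / (2 * c)
    return total

def angular_momentum(masses, positions, velocities):
    """Sum of r x p with p = m v (valid for v << c)."""
    total = [0.0, 0.0, 0.0]
    for m, r, v in zip(masses, positions, velocities):
        rv = cross(r, v)
        for i in range(3):
            total[i] += m * rv[i]
    return total

# Illustrative data (assumed): every particle has the same e/m.
c = 1.0
e_over_m = 0.5
masses = [1.0, 2.0, 3.5]
charges = [e_over_m * m for m in masses]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 0.5)]
velocities = [(0.0, 0.01, 0.0), (0.02, 0.0, -0.01), (0.0, -0.03, 0.01)]

mu = magnetic_moment(charges, positions, velocities, c)
L = angular_momentum(masses, positions, velocities)
print(mu, [e_over_m / (2 * c) * x for x in L])
```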


PROBLEM 

Find the ratio of the magnetic moment to the angular momentum for a system of two charges (velocities
$v \ll c$).

Solution: Choosing the origin of coordinates as the centre of mass of the two particles we have
$m_1\mathbf{r}_1 + m_2\mathbf{r}_2 = 0$ and $\mathbf{p}_1 = -\mathbf{p}_2 = \mathbf{p}$, where $\mathbf{p}$ is the momentum of the relative motion. With the aid of these relations,
we find

$$\mathfrak{m} = \frac{1}{2c}\left(\frac{e_1}{m_1^2} + \frac{e_2}{m_2^2}\right)\frac{m_1m_2}{m_1 + m_2}\mathbf{M}.$$


§ 45. Larmor’s theorem 

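Let us consider a system of charges in an external constant uniform magnetic field. The
time average of the force acting on the system,

$$\overline{\mathbf{F}} = \sum\frac{e}{c}\overline{\mathbf{v}\times\mathbf{H}} = \overline{\frac{d}{dt}\sum\frac{e}{c}\mathbf{r}\times\mathbf{H}},$$

is zero, as is the time average of the time derivative of any quantity which varies over a finite
range. The average value of the moment of the forces is

$$\overline{\mathbf{K}} = \sum\frac{e}{c}\overline{(\mathbf{r}\times(\mathbf{v}\times\mathbf{H}))}$$

and is different from zero. It can be expressed in terms of the magnetic moment of the
system, by expanding the vector triple product:

$$\overline{\mathbf{K}} = \sum\frac{e}{c}\overline{\{\mathbf{v}(\mathbf{r}\cdot\mathbf{H}) - \mathbf{H}(\mathbf{v}\cdot\mathbf{r})\}} = \sum\frac{e}{c}\overline{\left\{\mathbf{v}(\mathbf{r}\cdot\mathbf{H}) - \frac{1}{2}\mathbf{H}\frac{d}{dt}r^2\right\}}.$$

The second term gives zero after averaging, so that

$$\overline{\mathbf{K}} = \sum\frac{e}{c}\overline{\mathbf{v}(\mathbf{r}\cdot\mathbf{H})} = \frac{1}{2c}\sum e\,\overline{\{\mathbf{v}(\mathbf{r}\cdot\mathbf{H}) - \mathbf{r}(\mathbf{v}\cdot\mathbf{H})\}}$$

[the last transformation is analogous to the one used in deriving (44.3)], or finally

$$\overline{\mathbf{K}} = \overline{\mathfrak{m}}\times\mathbf{H}. \tag{45.1}$$

We call attention to the analogy with formula (42.6) for the electrical case.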

The Lagrangian for a system of charges in an external constant uniform magnetic field
contains (compared with the Lagrangian for a closed system) the additional term

$$L_H = \sum\frac{e}{c}\mathbf{A}\cdot\mathbf{v} = \sum\frac{e}{2c}(\mathbf{H}\times\mathbf{r})\cdot\mathbf{v} = \sum\frac{e}{2c}(\mathbf{r}\times\mathbf{v})\cdot\mathbf{H} \tag{45.2}$$

[where we have used the expression (19.4) for the vector potential of a uniform field].
Introducing the magnetic moment of the system, we have:

$$L_H = \mathfrak{m}\cdot\mathbf{H}. \tag{45.3}$$

We call attention to the analogy with the electric field; in a uniform electric field, the
Lagrangian of a system of charges with total charge zero contains the term

$$L_E = \mathbf{d}\cdot\mathbf{E},$$

which in that case is the negative of the potential energy of the charge system (see § 42).

We now consider a system of charges performing a finite motion (with velocities $v \ll c$)
in the centrally symmetric electric field produced by a certain fixed charge. We transform
from the laboratory coordinate system to a system rotating uniformly around an axis passing
through the fixed particle. From the well-known formula, the velocity $\mathbf{v}$ of the particle in the
new coordinate system is related to its velocity $\mathbf{v}'$ in the old system by the relation

$$\mathbf{v}' = \mathbf{v} + \mathbf{\Omega}\times\mathbf{r},$$

where $\mathbf{r}$ is the radius vector of the particle and $\mathbf{\Omega}$ is the angular velocity of the rotating
coordinate system. In the fixed system the Lagrangian of the system of charges is

$$L = \sum\frac{mv'^2}{2} - U,$$

where U is the potential energy of the charges in the external field plus the energy of their
mutual interactions. The quantity U is a function of the distances of the charges from the
fixed particle and of their mutual separations; when transformed to the rotating system it
obviously remains unchanged. Therefore in the new system the Lagrangian is

$$L = \sum\frac{m}{2}(\mathbf{v} + \mathbf{\Omega}\times\mathbf{r})^2 - U.$$

Let us assume that all the charges have the same charge-to-mass ratio e/m, and set

$$\mathbf{\Omega} = \frac{e}{2mc}\mathbf{H}. \tag{45.4}$$

Then for sufficiently small H (when we can neglect terms in $H^2$) the Lagrangian becomes:

$$L = \sum\frac{mv^2}{2} + \frac{1}{2c}\sum e\,\mathbf{H}\times\mathbf{r}\cdot\mathbf{v} - U.$$

We see that it coincides with the Lagrangian which would have described the motion of the
charges in the laboratory system of coordinates in the presence of a constant magnetic field
(see (45.2)).

Thus we arrive at the result that, in the nonrelativistic case, the behaviour of a system of
charges all having the same e/m, performing a finite motion in a centrally symmetric electric
field and in a weak uniform magnetic field $\mathbf{H}$, is equivalent to the behaviour of the same
system of charges in the same electric field in a coordinate system rotating uniformly with
the angular velocity (45.4). This assertion is the content of the Larmor theorem, and the
angular velocity $\Omega = eH/2mc$ is called the Larmor frequency.

We can approach this same problem from a different point of view. If the magnetic field
$\mathbf{H}$ is sufficiently weak, the Larmor frequency will be small compared to the frequencies of
the finite motion of the system of charges. Then we may consider the averages, over times
small compared to the period $2\pi/\Omega$, of quantities describing the system. These new quantities
will vary slowly in time (with frequency $\Omega$).

Let us consider the change in the average angular momentum $\overline{\mathbf{M}}$ of the system. According
to a well-known equation of mechanics, the derivative of $\mathbf{M}$ is equal to the moment $\mathbf{K}$ of the
forces acting on the system. We therefore have, using (45.1):

$$\frac{d\overline{\mathbf{M}}}{dt} = \overline{\mathbf{K}} = \overline{\mathfrak{m}}\times\mathbf{H}.$$

If the e/m ratio is the same for all particles of the system, the angular momentum and
magnetic moment are proportional to one another, and we find by using formulas (44.5) and
(45.4):

$$\frac{d\overline{\mathbf{M}}}{dt} = -\mathbf{\Omega}\times\overline{\mathbf{M}}. \tag{45.5}$$

This equation states that the vector $\overline{\mathbf{M}}$ (and with it the magnetic moment $\overline{\mathfrak{m}}$) rotates with
angular velocity $-\mathbf{\Omega}$ around the direction of the field, while its absolute magnitude and the
angle which it makes with this direction remain fixed. (This motion is called the Larmor
precession.)
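
The precession described by (45.5) can be followed by integrating the equation numerically. The sketch below uses a classical fourth-order Runge-Kutta step, which is a choice made for this example; the values of $\mathbf{\Omega}$ and of the initial angular momentum are illustrative, and the function names are hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def precess(M, Omega, t_end, steps=10000):
    """Integrate dM/dt = -Omega x M from t = 0 to t_end with RK4."""
    def rhs(M):
        return tuple(-x for x in cross(Omega, M))

    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(M)
        k2 = rhs(tuple(m + 0.5 * h * k for m, k in zip(M, k1)))
        k3 = rhs(tuple(m + 0.5 * h * k for m, k in zip(M, k2)))
        k4 = rhs(tuple(m + h * k for m, k in zip(M, k3)))
        M = tuple(m + h / 6 * (a + 2 * b + 2 * c + d)
                  for m, a, b, c, d in zip(M, k1, k2, k3, k4))
    return M

# Illustrative values (assumed): field along z, M tilted away from it.
Omega = (0.0, 0.0, 2.0)
M0 = (1.0, 0.0, 1.0)
period = 2 * math.pi / 2.0

M_quarter = precess(M0, Omega, period / 4)
M_full = precess(M0, Omega, period)
print(M_quarter, M_full)
```

After a quarter period the component of the vector perpendicular to the field has turned by a right angle in the sense of $-\mathbf{\Omega}$, and after a full period $2\pi/\Omega$ the vector returns to its initial value; the component along the field and the magnitude stay constant throughout.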
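§ 46. The wave equation

The electromagnetic field in vacuum is determined by the Maxwell equations in which we
must set $\rho = 0$, $\mathbf{j} = 0$. We write them once more:

$$\operatorname{curl}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t}, \qquad \operatorname{div}\mathbf{H} = 0, \tag{46.1}$$

$$\operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t}, \qquad \operatorname{div}\mathbf{E} = 0. \tag{46.2}$$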

These equations possess nonzero solutions. This means that an electromagnetic field can 
exist even in the absence of any charges. 

Electromagnetic fields occurring in vacuum in the absence of charges are called 
electromagnetic waves. We now take up the study of the properties of such waves. 

First of all we note that such fields must necessarily be time-varying. In fact, in the
contrary case, $\partial\mathbf{H}/\partial t = \partial\mathbf{E}/\partial t = 0$ and the equations (46.1) and (46.2) go over into the
equations (36.1), (36.2) and (43.1), (43.2) of a constant field in which, however, we now
have $\rho = 0$, $\mathbf{j} = 0$. But the solution of these equations which is given by formulas (36.8) and
(43.5) becomes zero for $\rho = 0$, $\mathbf{j} = 0$.

We derive the equations determining the potentials of electromagnetic waves. 

As we already know, because of the ambiguity in the potentials we can always subject 
them to an auxiliary condition. For this reason, we choose the potentials of the electromagnetic 
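wave so that the scalar potential is zero:

$$\phi = 0. \tag{46.3}$$

Then

$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A}. \tag{46.4}$$

Substituting these two expressions in the first of equations (46.2), we get

$$\operatorname{curl}\operatorname{curl}\mathbf{A} = -\Delta\mathbf{A} + \operatorname{grad}\operatorname{div}\mathbf{A} = -\frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2}. \tag{46.5}$$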

Despite the fact that we have already imposed one auxiliary condition on the potentials, 
the potential A is still not completely unique. Namely, we can add to it the gradient of an 
arbitrary function which does not depend on the time (meanwhile leaving $\phi$ unchanged). In
particular, we can choose the potentials of the electromagnetic wave so that 

div A = 0. (46.6) 

In fact, substituting for E from (46.4) in div E = 0, we have 

$$\operatorname{div}\frac{\partial \mathbf{A}}{\partial t} = \frac{\partial}{\partial t}\operatorname{div}\mathbf{A} = 0,$$

that is, div A is a function only of the coordinates. This function can always be made zero 
by adding to A the gradient of a suitable time-independent function. 

The equation (46.5) now becomes 

$$\Delta\mathbf{A} - \frac{1}{c^2}\frac{\partial^2 \mathbf{A}}{\partial t^2} = 0. \qquad (46.7)$$

This is the equation which determines the potentials of electromagnetic waves. It is called
the d’Alembert equation, or the wave equation.†
Applying to (46.7) the operators curl and d/dt, we can verify that the electric and magnetic 
fields E and H satisfy the same wave equation. 

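We repeat the derivation of the wave equation in four-dimensional form. We write the
second pair of Maxwell equations for the field in the absence of charges in the form

$$\frac{\partial F^{ik}}{\partial x^k} = 0.$$

(This is equation (30.2) with $j^i = 0$.) Substituting $F^{ik}$, expressed in terms of the potentials,

$$F^{ik} = \frac{\partial A^k}{\partial x_i} - \frac{\partial A^i}{\partial x_k},$$

we get

$$\frac{\partial^2 A^k}{\partial x_i\,\partial x^k} - \frac{\partial^2 A^i}{\partial x_k\,\partial x^k} = 0. \qquad (46.8)$$

We impose on the potentials the auxiliary condition:

$$\frac{\partial A^k}{\partial x^k} = 0. \qquad (46.9)$$

(This condition is called the Lorentz condition, and potentials that satisfy it are said to be in
the Lorentz gauge.) Then the first term in (46.8) drops out and there remains

$$\frac{\partial^2 A^i}{\partial x_k\,\partial x^k} \equiv g^{kl}\frac{\partial^2 A^i}{\partial x^k\,\partial x^l} = 0. \qquad (46.10)$$

† The wave equation is sometimes written in the form $\Box\mathbf{A} = 0$, where

$$\Box = -\frac{\partial^2}{\partial x_i\,\partial x^i} = \Delta - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}$$

is called the d’Alembertian operator.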
This is the wave equation expressed in four-dimensional form.‡

In three-dimensional form, the condition (46.9) is:

$$\frac{1}{c}\frac{\partial \phi}{\partial t} + \operatorname{div}\mathbf{A} = 0. \qquad (46.11)$$

It is more general than the conditions $\phi = 0$ and div A = 0 that were used earlier; potentials
that satisfy those conditions also satisfy (46.11). But unlike them the Lorentz condition has 
a relativistically invariant character: potentials satisfying it in one frame satisfy it in any 
other frame (whereas condition (46.6) is generally violated if the frame is changed). 

§ 47. Plane waves 

We consider the special case of electromagnetic waves in which the field depends only on 
one coordinate, say x (and on the time). Such waves are said to be plane. In this case the 
equation for the field becomes 


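$$\frac{\partial^2 f}{\partial t^2} - c^2\frac{\partial^2 f}{\partial x^2} = 0, \qquad (47.1)$$

where by $f$ is understood any component of the vectors E or H.
To solve this equation, we rewrite it in the form

$$\left(\frac{\partial}{\partial t} - c\frac{\partial}{\partial x}\right)\left(\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right) f = 0,$$

and introduce new variables

$$\xi = t - \frac{x}{c}, \qquad \eta = t + \frac{x}{c},$$

so that $t = \tfrac{1}{2}(\eta + \xi)$, $x = \tfrac{c}{2}(\eta - \xi)$. Then

$$\frac{\partial}{\partial \xi} = \frac{1}{2}\left(\frac{\partial}{\partial t} - c\frac{\partial}{\partial x}\right), \qquad \frac{\partial}{\partial \eta} = \frac{1}{2}\left(\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right),$$

so that the equation for $f$ becomes

$$\frac{\partial^2 f}{\partial \xi\,\partial \eta} = 0.$$

The solution obviously has the form $f = f_1(\xi) + f_2(\eta)$, where $f_1$ and $f_2$ are arbitrary functions.
Thus

$$f = f_1\left(t - \frac{x}{c}\right) + f_2\left(t + \frac{x}{c}\right). \qquad (47.2)$$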


‡ It should be mentioned that the condition (46.9) still does not determine the choice of the potentials
uniquely. We can add to A a term grad $f$ and subtract a term $(1/c)(\partial f/\partial t)$ from $\phi$, where the function $f$ is not
arbitrary but must satisfy the wave equation $\Box f = 0$.

Suppose, for example, $f_2 = 0$, so that

$$f = f_1\left(t - \frac{x}{c}\right).$$

Let us clarify the meaning of this solution. In each plane x = const, the field changes with
the time; at each given moment the field is different for different x. It is clear that the field 
has the same values for coordinates x and times t which satisfy the relation t - (x/c) = const, 
that is, 


x = const + ct. 

This means that if, at some time t = 0, the field at a certain point x in space had some definite 
value, then after an interval of time t the field has that same value at a distance ct along the 
X axis from the original place. We can say that all the values of the electromagnetic field are 
propagated in space along the X axis with a velocity equal to the velocity of light, c. 
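
The general solution $f_1(t - x/c) + f_2(t + x/c)$ and the propagation of field values with velocity c can be illustrated with a short numerical sketch. The profile functions `f1` and `f2`, the value of `C`, and the step `h` below are arbitrary illustrative choices, not taken from the text; the residual is estimated with central finite differences, so it is small but not exactly zero.

```python
import math

C = 3.0  # illustrative propagation speed in arbitrary units

def f1(u):
    """Arbitrary smooth profile for the wave moving in the +x direction."""
    return math.exp(-u*u)

def f2(u):
    """Arbitrary smooth profile for the wave moving in the -x direction."""
    return math.sin(u)

def field(x, t, c=C):
    """General plane-wave solution f1(t - x/c) + f2(t + x/c)."""
    return f1(t - x/c) + f2(t + x/c)

def right_mover(x, t, c=C):
    """Only the part travelling in the positive x direction (f2 = 0)."""
    return f1(t - x/c)

def wave_residual(x, t, c=C, h=1e-3):
    """Finite-difference estimate of d2f/dt2 - c^2 d2f/dx2."""
    d2t = (field(x, t + h, c) - 2.0*field(x, t, c) + field(x, t - h, c)) / h**2
    d2x = (field(x + h, t, c) - 2.0*field(x, t, c) + field(x - h, t, c)) / h**2
    return d2t - c*c*d2x

print(wave_residual(0.4, 1.1))
# The value found at x = 0.5, t = 0 reappears a distance C*dt further along after a time dt.
print(right_mover(0.5, 0.0), right_mover(0.5 + C*2.0, 2.0))
```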

Thus,

$$f_1\left(t - \frac{x}{c}\right)$$

represents a plane wave moving in the positive direction along the X axis. It is easy to show
that

$$f_2\left(t + \frac{x}{c}\right)$$

represents a wave moving in the opposite, negative, direction along the X axis.

In § 46 we showed that the potentials of the electromagnetic wave can be chosen so that 
$\phi = 0$, and div A = 0. We choose the potentials of the plane wave which we are now
considering in this same way. The condition div A = 0 gives in this case

$$\frac{\partial A_x}{\partial x} = 0,$$

since all quantities are independent of y and z. According to (47.1) we then have also
$\partial^2 A_x/\partial t^2 = 0$, that is, $\partial A_x/\partial t = \text{const}$. But the derivative $\partial \mathbf{A}/\partial t$ determines the electric field,
and we see that the nonzero component $A_x$ represents in this case the presence of a constant
longitudinal electric field. Since such a field has no relation to the electromagnetic wave, we
can set $A_x = 0$.

Thus the vector potential of the plane wave can always be chosen perpendicular to the X 
axis, i.e. to the direction of propagation of that wave. 

We consider a plane wave moving in the positive direction of the X axis; in this wave, all 
quantities, in particular also A, are functions only of t - (x/c). From the formulas 

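$$\mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A}$$

we therefore obtain

$$\mathbf{E} = -\frac{1}{c}\mathbf{A}', \qquad \mathbf{H} = \nabla \times \mathbf{A} = \nabla\left(t - \frac{x}{c}\right) \times \mathbf{A}' = -\frac{1}{c}\,\mathbf{n} \times \mathbf{A}', \qquad (47.3)$$

where the prime denotes differentiation with respect to t - (x/c) and n is a unit vector along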
the direction of propagation of the wave. Substituting the first equation in the second, we 
obtain 

H = n x E. (47.4) 

We see that the electric and magnetic fields E and H of a plane wave are directed perpendicular 
to the direction of propagation of the wave. For this reason, electromagnetic waves are said 
to be transverse. From (47.4) it is clear also that the electric and magnetic fields of the plane 
wave are perpendicular to each other and equal to each other in absolute value. 
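
Relations (47.3) and (47.4) are easy to verify for a concrete case. The sketch below assumes units with c = 1 and a hypothetical value of the derivative A′ lying in the plane perpendicular to n; the function names are illustrative only.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def plane_wave_fields(A_prime, n, c=1.0):
    """Fields from (47.3): E = -(1/c) A',  H = -(1/c) n x A'."""
    E = tuple(-a / c for a in A_prime)
    H = tuple(-h / c for h in cross(n, A_prime))
    return E, H

n = (1.0, 0.0, 0.0)            # direction of propagation (x axis)
A_prime = (0.0, 2.0, -1.0)     # hypothetical transverse derivative of the potential
E, H = plane_wave_fields(A_prime, n)
print(E, H)
print(cross(n, E))             # should coincide with H, relation (47.4)
print(dot(E, H), dot(E, E), dot(H, H))
```

The same vectors also show that E × H points along n, in line with the direction of the energy flux discussed next.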

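The energy flux in the plane wave, i.e. its Poynting vector, is

$$\mathbf{S} = \frac{c}{4\pi}\,\mathbf{E} \times \mathbf{H} = \frac{c}{4\pi}\,\mathbf{E} \times (\mathbf{n} \times \mathbf{E}),$$

and since $\mathbf{E} \cdot \mathbf{n} = 0$,

$$\mathbf{S} = \frac{c}{4\pi}E^2\,\mathbf{n} = \frac{c}{4\pi}H^2\,\mathbf{n}.$$

Thus the energy flux is directed along the direction of propagation of the wave. Since

$$W = \frac{1}{8\pi}\left(E^2 + H^2\right) = \frac{E^2}{4\pi}$$

is the energy density of the wave, we can write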

S = cWn, (47.5) 

in accordance with the fact that the field propagates with the velocity of light. 

The momentum per unit volume of the electromagnetic field is $\mathbf{S}/c^2$. For a plane wave this
gives (W/c)n. We call attention to the fact that the relation between energy W and momentum
W/c for the electromagnetic wave is the same as for a particle moving with the velocity of
light [see (9.9)].

The flux of momentum of the field is determined by the components $\sigma_{\alpha\beta}$ of the Maxwell
stress tensor (33.3). Choosing the direction of propagation of the wave as the X axis, we find
that the only nonzero component of $T^{\alpha\beta}$ is

$$T^{xx} = -\sigma_{xx} = W. \qquad (47.6)$$

As it must be, the flux of momentum is along the direction of propagation of the wave, and 
is equal in magnitude to the energy density. 

Let us find the law of transformation of the energy density of a plane electromagnetic 
wave when we change from one inertial reference system to another. To do this we start from 
the formula

$$W = \frac{1}{1 - V^2/c^2}\left(W' + 2\frac{V}{c^2}S'_x - \frac{V^2}{c^2}\sigma'_{xx}\right)$$

(see the problem in § 33) and must substitute

$$S'_x = cW'\cos\alpha', \qquad \sigma'_{xx} = -W'\cos^2\alpha',$$

where $\alpha'$ is the angle (in the K' system) between the X' axis (along which the velocity V is
directed) and the direction of propagation of the wave. We find:

$$W = W'\,\frac{\left(1 + \dfrac{V}{c}\cos\alpha'\right)^2}{1 - \dfrac{V^2}{c^2}}. \qquad (47.7)$$

Since $W = E^2/4\pi = H^2/4\pi$, the absolute values of the field intensities in the wave transform
like $\sqrt{W}$.
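
As a small worked illustration of (47.7), the following sketch evaluates the transformed energy density and the corresponding scale factor of the field amplitudes. The function names and the sample values (`beta` stands for V/c) are illustrative assumptions.

```python
import math

def energy_density(W_prime, beta, alpha_prime):
    """Transform the plane-wave energy density by (47.7); beta = V/c."""
    return W_prime * (1.0 + beta*math.cos(alpha_prime))**2 / (1.0 - beta**2)

def amplitude_factor(beta, alpha_prime):
    """Ratio |E|/|E'| of the field amplitudes, which transform like sqrt(W)."""
    return math.sqrt(energy_density(1.0, beta, alpha_prime))

# Wave travelling along the direction of V in the K' system (alpha' = 0)
print(energy_density(1.0, 0.6, 0.0), amplitude_factor(0.6, 0.0))
# Wave travelling opposite to V (alpha' = pi)
print(energy_density(1.0, 0.6, math.pi))
```

For $\alpha' = 0$ the expression reduces to $W'(1 + V/c)/(1 - V/c)$, and for $\alpha' = \pi$ to $W'(1 - V/c)/(1 + V/c)$.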


PROBLEMS 

1. Determine the force exerted on a wall from which an incident plane electromagnetic wave is reflected 
(with reflection coefficient R). 

Solution: The force f acting on unit area of the wall is given by the flux of momentum through this area, 
i.e., it is the vector with components 

$$f_\alpha = -\sigma_{\alpha\beta}N_\beta - \sigma'_{\alpha\beta}N_\beta,$$

where N is the vector normal to the surface of the wall, and $\sigma_{\alpha\beta}$ and $\sigma'_{\alpha\beta}$ are the components of the
energy-momentum tensors for the incident and reflected waves. Using (47.6), we obtain:

$$\mathbf{f} = W\mathbf{n}(\mathbf{N} \cdot \mathbf{n}) + W'\mathbf{n}'(\mathbf{N} \cdot \mathbf{n}').$$

From the definition of the reflection coefficient, we have: W' = RW. Also introducing the angle of
incidence $\theta$ (which is equal to the reflection angle) and writing out components, we find the normal force
(“light pressure”)

$$f_N = W(1 + R)\cos^2\theta$$

and the tangential force

$$f_t = W(1 - R)\sin\theta\cos\theta.$$
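
The two results of this problem can be evaluated directly. The sketch below is only an illustration; the function name `radiation_force` and the sample values are assumptions.

```python
import math

def radiation_force(W, R, theta):
    """Normal and tangential force per unit area on a wall.

    W is the energy density of the incident wave, R the reflection
    coefficient, theta the angle of incidence in radians.
    """
    f_normal = W * (1.0 + R) * math.cos(theta)**2
    f_tangential = W * (1.0 - R) * math.sin(theta) * math.cos(theta)
    return f_normal, f_tangential

print(radiation_force(1.0, 1.0, 0.0))          # perfect mirror, normal incidence
print(radiation_force(1.0, 0.0, math.pi/6))    # perfect absorber, oblique incidence
```

For a perfectly reflecting wall the tangential force vanishes and the pressure at normal incidence is 2W; for a perfectly absorbing wall the total force has magnitude W cos θ and is directed along n.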


2. Use the Hamilton-Jacobi method to find the motion of a charge in the field of a plane electromagnetic 
wave with vector potential A[t - (x/c)]. 

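Solution: We write the Hamilton-Jacobi equation in four-dimensional form:

$$g^{ik}\left(\frac{\partial S}{\partial x^i} + \frac{e}{c}A_i\right)\left(\frac{\partial S}{\partial x^k} + \frac{e}{c}A_k\right) = m^2c^2. \qquad (1)$$

The fact that the field is a plane wave means that the $A^i$ are functions of one independent variable, which
can be written in the form $\xi = k_i x^i$, where $k^i$ is a constant four-vector with its square equal to zero, $k_i k^i = 0$
(see the following section). We subject the potentials to the Lorentz condition

$$\frac{\partial A^i}{\partial x^i} = \frac{dA^i}{d\xi}\,k_i = 0;$$

for the variable field this is equivalent to the condition $A^i k_i = 0$.
We seek a solution of equation (1) in the form

$$S = -f_i x^i + F(\xi),$$

where $f^i = (f^0, \mathbf{f})$ is a constant vector satisfying the condition $f_i f^i = m^2c^2$ ($S = -f_i x^i$ is the solution of the
Hamilton-Jacobi equation for a free particle with four-momentum $p^i = f^i$). Substitution in (1) gives the
equation

$$\frac{e^2}{c^2}A_iA^i - 2\gamma\frac{dF}{d\xi} - \frac{2e}{c}f_iA^i = 0,$$

where the constant $\gamma = k_i f^i$. Having determined F from this equation, we get

$$S = -f_i x^i - \frac{e}{c\gamma}\int f_iA^i\,d\xi + \frac{e^2}{2\gamma c^2}\int A_iA^i\,d\xi. \qquad (2)$$

Changing to three-dimensional notation with a fixed reference frame, we choose the direction of propagation
of the wave as the x axis. Then $\xi = ct - x$, while the constant $\gamma = f^0 - f^1$. Denoting the two-dimensional
vector $f_y, f_z$ by $\boldsymbol{\kappa}$, we find from the condition $f_i f^i = (f^0)^2 - (f^1)^2 - \kappa^2 = m^2c^2$,

$$f^0 + f^1 = \frac{m^2c^2 + \kappa^2}{\gamma}.$$

We choose the potentials in the gauge in which $\phi = 0$, while $\mathbf{A}(\xi)$ lies in the yz plane. Then equation (2) takes
the form:

$$S = \boldsymbol{\kappa} \cdot \mathbf{r} - \frac{\gamma}{2}(ct + x) - \frac{m^2c^2 + \kappa^2}{2\gamma}\,\xi + \frac{e}{c\gamma}\int \boldsymbol{\kappa} \cdot \mathbf{A}\,d\xi - \frac{e^2}{2\gamma c^2}\int \mathbf{A}^2\,d\xi.$$

According to the general rules (Mechanics, § 47), to determine the motion we must equate the derivatives
$\partial S/\partial\boldsymbol{\kappa}$, $\partial S/\partial\gamma$ to certain new constants, which can be made to vanish by a suitable choice of the coordinate
and time origins. We thus obtain the parametric equations in $\xi$:

$$y = \frac{1}{\gamma}\kappa_y\xi - \frac{e}{c\gamma}\int A_y\,d\xi, \qquad z = \frac{1}{\gamma}\kappa_z\xi - \frac{e}{c\gamma}\int A_z\,d\xi,$$

$$x = \frac{1}{2}\left(\frac{m^2c^2 + \kappa^2}{\gamma^2} - 1\right)\xi - \frac{e}{c\gamma^2}\int \boldsymbol{\kappa} \cdot \mathbf{A}\,d\xi + \frac{e^2}{2\gamma^2c^2}\int \mathbf{A}^2\,d\xi, \qquad ct = \xi + x.$$

The generalized momentum $\mathbf{P} = \mathbf{p} + (e/c)\mathbf{A}$ and the energy $\mathcal{E}$ are found by differentiating the action with
respect to the coordinates and the time; this gives:

$$p_y = \kappa_y - \frac{e}{c}A_y, \qquad p_z = \kappa_z - \frac{e}{c}A_z,$$

$$p_x = -\frac{\gamma}{2} + \frac{m^2c^2 + \kappa^2}{2\gamma} - \frac{e}{c\gamma}\,\boldsymbol{\kappa} \cdot \mathbf{A} + \frac{e^2}{2\gamma c^2}\mathbf{A}^2, \qquad \mathcal{E} = (\gamma + p_x)c.$$

If we average these over the time, the terms of first degree in the periodic function $\mathbf{A}(\xi)$ vanish. We assume
that the reference system has been chosen so that the particle is at rest in it on the average, i.e. so that its
averaged momentum is zero. Then

$$\boldsymbol{\kappa} = 0, \qquad \gamma^2 = m^2c^2 + \frac{e^2}{c^2}\overline{\mathbf{A}^2}.$$

The final formulas for determining the motion have the form:

$$x = \frac{e^2}{2\gamma^2c^2}\int\left(\mathbf{A}^2 - \overline{\mathbf{A}^2}\right)d\xi, \qquad y = -\frac{e}{c\gamma}\int A_y\,d\xi, \qquad z = -\frac{e}{c\gamma}\int A_z\,d\xi,$$

$$ct = \xi + \frac{e^2}{2\gamma^2c^2}\int\left(\mathbf{A}^2 - \overline{\mathbf{A}^2}\right)d\xi; \qquad (3)$$

$$p_x = \frac{e^2}{2\gamma c^2}\left(\mathbf{A}^2 - \overline{\mathbf{A}^2}\right), \qquad p_y = -\frac{e}{c}A_y, \qquad p_z = -\frac{e}{c}A_z,$$

$$\mathcal{E} = c\gamma + \frac{e^2}{2\gamma c}\left(\mathbf{A}^2 - \overline{\mathbf{A}^2}\right). \qquad (4)$$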


§ 48. Monochromatic plane waves 

A very important special case of electromagnetic waves is a wave in which the field is a 
simply periodic function of the time. Such a wave is said to be monochromatic. All quantities 
(potentials, field components) in a monochromatic wave depend on the time through a factor 
of the form $\cos(\omega t + \alpha)$. The quantity $\omega$ is called the cyclic frequency of the wave (we shall
simply call it the frequency).

In the wave equation, the second derivative of the field with respect to the time is now
$\partial^2 f/\partial t^2 = -\omega^2 f$, so that the distribution of the field in space is determined for a monochromatic
wave by the equation

$$\Delta f + \frac{\omega^2}{c^2}f = 0. \qquad (48.1)$$

In a plane wave (propagating along the x axis), the field is a function only of t - (x/c).
Therefore, if the plane wave is monochromatic, its field is a simply periodic function of
t - (x/c). The vector potential of such a wave is most conveniently written as the real part of
a complex expression:

$$\mathbf{A} = \operatorname{Re}\left\{\mathbf{A}_0 e^{-i\omega(t - x/c)}\right\}. \qquad (48.2)$$

Here $\mathbf{A}_0$ is a certain constant complex vector. Obviously, the fields E and H of such a wave
have analogous forms with the same frequency $\omega$. The quantity

$$\lambda = \frac{2\pi c}{\omega} \qquad (48.3)$$

is called the wavelength; it is the period of variation of the field with the coordinate x at a
fixed time t.

The vector

$$\mathbf{k} = \frac{\omega}{c}\,\mathbf{n} \qquad (48.4)$$

(where n is a unit vector along the direction of propagation of the wave) is called the wave
vector. In terms of it we can write (48.2) in the form

$$\mathbf{A} = \operatorname{Re}\left\{\mathbf{A}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}\right\}, \qquad (48.5)$$

which is independent of the choice of coordinate axes. The quantity which appears multiplied 
by i in the exponent is called the phase of the wave. 
So long as we perform only linear operations, we can omit the sign Re for taking the real
part, and operate with complex quantities as such.† Thus, substituting

$$\mathbf{A} = \mathbf{A}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}$$

in (47.3), we find the relation between the intensities and the vector potential of a plane
monochromatic wave in the form

$$\mathbf{E} = ik\mathbf{A}, \qquad \mathbf{H} = i\mathbf{k} \times \mathbf{A}. \qquad (48.6)$$

We now treat in more detail the direction of the field of a monochromatic wave. To be
specific, we shall talk of the electric field

$$\mathbf{E} = \operatorname{Re}\left\{\mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}\right\}$$

(everything stated below applies equally well, of course, to the magnetic field). The quantity
$\mathbf{E}_0$ is a certain complex vector. Its square $\mathbf{E}_0^2$ is (in general) a complex number. If the
argument of this number is $-2\alpha$ (i.e. $\mathbf{E}_0^2 = |\mathbf{E}_0^2|\,e^{-2i\alpha}$), the vector b defined by

$$\mathbf{E}_0 = \mathbf{b}\,e^{-i\alpha} \qquad (48.7)$$

will have its square real, $\mathbf{b}^2 = |\mathbf{E}_0^2|$. With this definition, we write:

$$\mathbf{E} = \operatorname{Re}\left\{\mathbf{b}\,e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t - \alpha)}\right\}. \qquad (48.8)$$

We write b in the form

$$\mathbf{b} = \mathbf{b}_1 + i\mathbf{b}_2,$$

where $\mathbf{b}_1$ and $\mathbf{b}_2$ are real vectors. Since $\mathbf{b}^2 = b_1^2 - b_2^2 + 2i\,\mathbf{b}_1 \cdot \mathbf{b}_2$ must be a real quantity,
$\mathbf{b}_1 \cdot \mathbf{b}_2 = 0$, i.e. the vectors $\mathbf{b}_1$ and $\mathbf{b}_2$ are mutually perpendicular. We choose the direction of
$\mathbf{b}_1$ as the y axis (and the x axis along the direction of propagation of the wave). We then have
from (48.8):

$$E_y = b_1\cos(\omega t - \mathbf{k} \cdot \mathbf{r} + \alpha), \qquad E_z = \pm b_2\sin(\omega t - \mathbf{k} \cdot \mathbf{r} + \alpha), \qquad (48.9)$$

where we use the plus (minus) sign if $\mathbf{b}_2$ is along the positive (negative) z axis. From (48.9)
it follows that

$$\frac{E_y^2}{b_1^2} + \frac{E_z^2}{b_2^2} = 1. \qquad (48.10)$$

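† If two quantities A(t) and B(t) are written in complex form

$$\mathbf{A}(t) = \mathbf{A}_0 e^{-i\omega t}, \qquad \mathbf{B}(t) = \mathbf{B}_0 e^{-i\omega t},$$

then in forming their product we must first, of course, separate out the real part. But if, as it frequently
happens, we are interested only in the time average of this product, it can be computed as

$$\tfrac{1}{2}\operatorname{Re}\{\mathbf{A} \cdot \mathbf{B}^*\}.$$

In fact, we have:

$$\operatorname{Re}\mathbf{A} \cdot \operatorname{Re}\mathbf{B} = \tfrac{1}{4}\left(\mathbf{A}_0 e^{-i\omega t} + \mathbf{A}_0^* e^{i\omega t}\right) \cdot \left(\mathbf{B}_0 e^{-i\omega t} + \mathbf{B}_0^* e^{i\omega t}\right).$$

When we average, the terms containing factors $e^{\pm 2i\omega t}$ vanish, so that we are left with

$$\overline{\operatorname{Re}\mathbf{A} \cdot \operatorname{Re}\mathbf{B}} = \tfrac{1}{4}\left(\mathbf{A}_0 \cdot \mathbf{B}_0^* + \mathbf{A}_0^* \cdot \mathbf{B}_0\right) = \tfrac{1}{2}\operatorname{Re}(\mathbf{A} \cdot \mathbf{B}^*).$$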
Thus we see that, at each point in space, the electric field vector rotates in a plane
perpendicular to the direction of propagation of the wave, while its endpoint describes the
ellipse (48.10). Such a wave is said to be elliptically polarized. The rotation occurs in the
direction of (opposite to) a right-hand screw rotating along the x axis, if we have the plus
(minus) sign in (48.9).

If $b_1 = b_2$, the ellipse (48.10) reduces to a circle, i.e. the vector E rotates while remaining
constant in magnitude. In this case we say that the wave is circularly polarized. The choice
of the directions of the y and z axes is now obviously arbitrary. We note that in such a wave
the ratio of the y and z components of the complex amplitude $\mathbf{E}_0$ is

$$\frac{E_{0z}}{E_{0y}} = \pm i \qquad (48.11)$$

for rotation in the same (opposite) direction as that of a right-hand screw (right and left
polarizations).†

Finally, if $b_1$ or $b_2$ equals zero, the field of the wave is everywhere and always parallel (or
antiparallel) to one and the same direction. In this case the wave is said to be linearly
polarized, or plane polarized. An elliptically polarized wave can clearly be treated as the
superposition of two plane polarized waves.
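
The construction of the vector b from the complex amplitude can be illustrated by a short sketch. It assumes the amplitude is given by its two transverse components `E0y`, `E0z`; the function name `ellipse_axes` and the sample amplitudes are hypothetical. The phase α is taken from the argument of the complex square of the amplitude, as in (48.7), and the real and imaginary parts of b then give the two semiaxes.

```python
import cmath
import math

def ellipse_axes(E0y, E0z):
    """Split E0 = b * exp(-i*alpha), with b = b1 + i*b2 and b^2 real.

    Returns (alpha, b1, b2); b1 and b2 are real 2-vectors of (y, z) components.
    """
    square = E0y*E0y + E0z*E0z           # complex square E0^2 (not |E0|^2)
    alpha = -0.5 * cmath.phase(square)   # arg(E0^2) = -2*alpha
    rot = cmath.exp(1j * alpha)          # b = E0 * exp(+i*alpha)
    by, bz = E0y * rot, E0z * rot
    return alpha, (by.real, bz.real), (by.imag, bz.imag)

# Hypothetical amplitude: semiaxes 2 (along y) and 1 (along z), common phase 0.3
alpha, b1, b2 = ellipse_axes(2.0*cmath.exp(0.3j), 1j*cmath.exp(0.3j))
print(alpha, b1, b2)
print(b1[0]*b2[0] + b1[1]*b2[1])   # scalar product of b1 and b2
```

With this choice of α the square of b is real and non-negative, so the vector b1 returned by the sketch is never shorter than b2. For a circularly polarized amplitude the complex square vanishes and α is arbitrary; the sketch then simply returns α = 0.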

Now let us turn to the definition of the wave vector and introduce the four-dimensional 
wave vector with components 

$$k^i = \left(\frac{\omega}{c},\ \mathbf{k}\right). \qquad (48.12)$$

That these quantities actually form a four-vector is obvious from the fact that we get a scalar
(the phase of the wave) when we multiply by $x^i$:

$$k_i x^i = \omega t - \mathbf{k} \cdot \mathbf{r}. \qquad (48.13)$$

From the definitions (48.4) and (48.12) we see that the square of the wave four-vector is
zero:

$$k^i k_i = 0. \qquad (48.14)$$

This relation also follows directly from the fact that the expression

$$\mathbf{A} = \mathbf{A}_0 e^{-ik_i x^i}$$

must be a solution of the wave equation (46.10).

As is the case for every plane wave, in a monochromatic wave propagating along the x
axis only the following components of the energy-momentum tensor are different from zero
(see § 47):

$$T^{00} = T^{01} = T^{11} = W.$$

By means of the wave four-vector, these equations can be written in tensor form as

$$T^{ik} = \frac{Wc^2}{\omega^2}\,k^i k^k. \qquad (48.15)$$

† We assume that the coordinate axes form a right-handed system.
Finally, by using the law of transformation of the wave four-vector we can easily treat the
so-called Doppler effect, the change in frequency $\omega$ of the wave emitted by a source
moving with respect to the observer, as compared to the “true” frequency $\omega_0$ of the same
source in the reference system ($K_0$) in which it is at rest.

Let V be the velocity of the source, i.e. the velocity of the $K_0$ system relative to K.
According to the general formula for transformation of four-vectors, we have:

$$k^{(0)0} = \frac{k^0 - \dfrac{V}{c}k^1}{\sqrt{1 - \dfrac{V^2}{c^2}}}$$

(the velocity of the K system relative to $K_0$ is -V). Substituting $k^{(0)0} = \omega_0/c$, $k^0 = \omega/c$, $k^1 = k\cos\alpha =
(\omega/c)\cos\alpha$, where $\alpha$ is the angle (in the K system) between the direction of emission of the
wave and the direction of motion of the source, and expressing $\omega$ in terms of $\omega_0$, we obtain:

$$\omega = \omega_0\,\frac{\sqrt{1 - \dfrac{V^2}{c^2}}}{1 - \dfrac{V}{c}\cos\alpha}. \qquad (48.16)$$

This is the required formula. For $V \ll c$, and if the angle $\alpha$ is not too close to $\pi/2$, it gives:

$$\omega \approx \omega_0\left(1 + \frac{V}{c}\cos\alpha\right). \qquad (48.17)$$

For $\alpha = \pi/2$, we have:

$$\omega = \omega_0\sqrt{1 - \frac{V^2}{c^2}} \approx \omega_0\left(1 - \frac{V^2}{2c^2}\right); \qquad (48.18)$$

in this case the relative change in frequency is proportional to the square of the ratio V/c.
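
Formula (48.16) and its two limiting forms can be compared numerically. The sketch below is illustrative; the function name `doppler` and the sample values are assumptions, and `beta` stands for V/c.

```python
import math

def doppler(omega0, beta, alpha):
    """Observed frequency from (48.16); beta = V/c, alpha measured in the K system."""
    return omega0 * math.sqrt(1.0 - beta**2) / (1.0 - beta*math.cos(alpha))

omega0 = 1.0
beta = 1e-3
alpha = 0.4

exact = doppler(omega0, beta, alpha)
first_order = omega0 * (1.0 + beta*math.cos(alpha))      # approximation (48.17)
print(exact, first_order)

transverse = doppler(omega0, beta, math.pi/2)            # case alpha = pi/2
second_order = omega0 * (1.0 - 0.5*beta**2)              # approximation (48.18)
print(transverse, second_order)
```

For small V/c the exact and approximate values differ only in terms of order (V/c)², and in the transverse case the shift itself is of that order.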


PROBLEMS 

1. Determine the direction and magnitude of the axes of the polarization ellipse in terms of the complex
amplitude $\mathbf{E}_0$.

Solution: The problem consists in determining the vector $\mathbf{b} = \mathbf{b}_1 + i\mathbf{b}_2$, whose square is real. We have from
(48.7):

$$\mathbf{E}_0 \cdot \mathbf{E}_0^* = b_1^2 + b_2^2, \qquad \mathbf{E}_0 \times \mathbf{E}_0^* = -2i\,\mathbf{b}_1 \times \mathbf{b}_2, \qquad (1)$$

or

$$b_1^2 + b_2^2 = A^2 + B^2, \qquad b_1 b_2 = AB\sin\delta,$$

where we have introduced the notation

$$|E_{0y}| = A, \qquad |E_{0z}| = B, \qquad \frac{E_{0z}}{B} = \frac{E_{0y}}{A}\,e^{i\delta}$$

for the absolute values of $E_{0y}$ and $E_{0z}$ and for the phase difference $\delta$ between them. Then

$$2b_{1,2} = \sqrt{A^2 + B^2 + 2AB\sin\delta} \pm \sqrt{A^2 + B^2 - 2AB\sin\delta}, \qquad (2)$$

from which we get the magnitudes of the semiaxes of the polarization ellipse.

To determine their directions (relative to the arbitrary initial axes y and z) we start from the equality

$$\operatorname{Re}\{(\mathbf{E}_0 \cdot \mathbf{b}_1)(\mathbf{E}_0^* \cdot \mathbf{b}_2)\} = 0,$$

which is easily verified by substituting $\mathbf{E}_0 = (\mathbf{b}_1 + i\mathbf{b}_2)e^{-i\alpha}$. Writing out this equality in the y, z coordinates,
we get for the angle $\theta$ between the direction of $\mathbf{b}_1$ and the y axis:

$$\tan 2\theta = \frac{2AB\cos\delta}{A^2 - B^2}. \qquad (3)$$

The direction of rotation of the field is determined by the sign of the x component of the vector $\mathbf{b}_1 \times \mathbf{b}_2$.
Taking its expression from (1),

$$2i(\mathbf{b}_1 \times \mathbf{b}_2)_x = E_{0z}E_{0y}^* - E_{0z}^*E_{0y} = |E_{0y}|^2\left\{\frac{E_{0z}}{E_{0y}} - \left(\frac{E_{0z}}{E_{0y}}\right)^*\right\},$$

we see that the direction of $\mathbf{b}_1 \times \mathbf{b}_2$ (whether it is along or opposite to the positive direction of the x axis),
and the sign of the rotation (whether in the same direction, or opposite to the direction of a right-hand screw
along the x axis) are given by the sign of the imaginary part of the ratio $E_{0z}/E_{0y}$ (plus for the first case and
minus for the second). This is a generalization of the rule (48.11) for the case of circular polarization.

2. Determine the motion of a charge in the field of a plane monochromatic linearly polarized wave. 

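Solution: Choosing the direction of the field E of the wave as the y axis, we write:

$$E_y = E = E_0\cos\omega\xi, \qquad A_y = A = -\frac{cE_0}{\omega}\sin\omega\xi$$

($\xi = t - x/c$). From formulas (3) and (4) of problem 2, § 47, we find (in the reference system in which
the particle is at rest on the average) the following representation of the motion in terms of the parameter
$\eta = \omega\xi$:

$$x = -\frac{e^2E_0^2c}{8\gamma^2\omega^3}\sin 2\eta, \qquad y = -\frac{eE_0c}{\gamma\omega^2}\cos\eta, \qquad z = 0,$$

$$t = \frac{\eta}{\omega} - \frac{e^2E_0^2}{8\gamma^2\omega^3}\sin 2\eta, \qquad \gamma^2 = m^2c^2 + \frac{e^2E_0^2}{2\omega^2};$$

$$p_x = -\frac{e^2E_0^2}{4\gamma\omega^2}\cos 2\eta, \qquad p_y = \frac{eE_0}{\omega}\sin\eta, \qquad p_z = 0.$$

The charge moves in the x, y plane in a symmetric figure-8 curve with its longitudinal
axis along the y axis. During a period of the motion, $\eta$ varies from 0 to $2\pi$.

3. Determine the motion of a charge in the field of a circularly polarized wave.

Solution: For the field of the wave we have:

$$E_y = E_0\cos\omega\xi, \qquad E_z = E_0\sin\omega\xi,$$

$$A_y = -\frac{cE_0}{\omega}\sin\omega\xi, \qquad A_z = \frac{cE_0}{\omega}\cos\omega\xi.$$

The motion is given by the formulas:

$$x = 0, \qquad y = -\frac{ecE_0}{\gamma\omega^2}\cos\omega t, \qquad z = -\frac{ecE_0}{\gamma\omega^2}\sin\omega t,$$

$$p_x = 0, \qquad p_y = \frac{eE_0}{\omega}\sin\omega t, \qquad p_z = -\frac{eE_0}{\omega}\cos\omega t, \qquad \gamma^2 = m^2c^2 + \frac{e^2E_0^2}{\omega^2}.$$

Thus the charge moves in the y, z plane along a circle of radius $ecE_0/\gamma\omega^2$ with a momentum having the
constant magnitude $p = eE_0/\omega$; at each instant the direction of the momentum p is opposite to the direction
of the magnetic field H of the wave.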
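
The figure-8 orbit of Problem 2 can be traced numerically. The sketch below assumes that, for the linearly polarized wave, integrating formulas (3) of problem 2, § 47 gives $x = -a_x\sin 2\eta$, $y = -a_y\cos\eta$ with $a_x = e^2E_0^2c/8\gamma^2\omega^3$, $a_y = eE_0c/\gamma\omega^2$ and $\gamma^2 = m^2c^2 + e^2E_0^2/2\omega^2$; the function name and the unit values of e, m, c, E0 and ω are purely illustrative.

```python
import math

def figure_eight(e, m, c, E0, omega, n=8):
    """Sample n points of the orbit x = -ax*sin(2*eta), y = -ay*cos(eta)."""
    gamma = math.sqrt(m*m*c*c + e*e*E0*E0 / (2.0*omega*omega))
    ax = e*e*E0*E0*c / (8.0*gamma**2*omega**3)   # amplitude along the wave direction
    ay = e*E0*c / (gamma*omega**2)               # amplitude along the field direction
    points = []
    for k in range(n):
        eta = 2.0*math.pi*k/n
        points.append((-ax*math.sin(2.0*eta), -ay*math.cos(eta)))
    return gamma, points

gamma, points = figure_eight(e=1.0, m=1.0, c=1.0, E0=1.0, omega=1.0)
for x, y in points:
    print(round(x, 6), round(y, 6))
```

The sampled points pass through the origin twice per period and are mirror images of one another under y → −y, which is the symmetry of a figure 8 elongated along the y axis.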

§ 49. Spectral resolution 

Every wave can be subjected to the process of spectral resolution, i.e. can be represented 
as a superposition of monochromatic waves with various frequencies. The character of this 
expansion varies according to the character of the time dependence of the field. 

One category consists of those cases where the expansion contains frequencies forming a 
discrete sequence of values. The simplest case of this type arises in the resolution of a purely 
periodic (though not monochromatic) field. This is the usual expansion in Fourier series; it 
contains the frequencies which are integral multiples of the “fundamental” frequency $\omega_0 =
2\pi/T$, where T is the period of the field. We write it in the form

$$f = \sum_{n=-\infty}^{\infty} f_n e^{-i\omega_0 n t} \qquad (49.1)$$

(where $f$ is any of the quantities describing the field). The quantities $f_n$ are defined in terms
of the function $f$ by the integrals

$$f_n = \frac{1}{T}\int_{-T/2}^{T/2} f(t)\,e^{in\omega_0 t}\,dt. \qquad (49.2)$$

Because $f(t)$ must be real,

$$f_{-n} = f_n^*. \qquad (49.3)$$

In more complicated cases, the expansion may contain integral multiples (and sums of 
integral multiples) of several different incommensurable fundamental frequencies. 

When the sum (49.1) is squared and averaged over the time, the products of terms with 
different frequencies give zero because they contain oscillating factors. Only terms of the 
form $f_n f_{-n} = |f_n|^2$ remain. Thus the average of the square of the field, i.e. the average
intensity of the wave, is the sum of the intensities of its monochromatic components:

$$\overline{f^2} = \sum_{n=-\infty}^{\infty} |f_n|^2 = 2\sum_{n=1}^{\infty} |f_n|^2 \qquad (49.4)$$

(where it is assumed that the average of the function $f$ over a period is zero, i.e. $f_0 = \bar{f} = 0$).
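
Relation (49.4) can be checked on a simple periodic signal. In the sketch below the signal, the period `T`, the number of samples and the function names are illustrative assumptions; the integrals (49.2) are approximated by a midpoint sum over one period.

```python
import cmath
import math

def fourier_coefficient(f, T, n, samples=2048):
    """Approximate f_n = (1/T) * integral of f(t) exp(i n w0 t) dt over one period."""
    w0 = 2.0*math.pi/T
    dt = T/samples
    total = 0j
    for k in range(samples):
        t = -T/2 + (k + 0.5)*dt
        total += f(t) * cmath.exp(1j*n*w0*t)
    return total/samples

def mean_square(f, T, samples=2048):
    """Approximate time average of f(t)^2 over one period."""
    dt = T/samples
    return sum(f(-T/2 + (k + 0.5)*dt)**2 for k in range(samples))/samples

T = 2.0
w0 = 2.0*math.pi/T

def signal(t):
    """Hypothetical periodic field with zero mean: two harmonics."""
    return 3.0*math.cos(w0*t) + math.sin(2.0*w0*t)

lhs = mean_square(signal, T)
rhs = 2.0*sum(abs(fourier_coefficient(signal, T, n))**2 for n in range(1, 6))
print(lhs, rhs)
```

For this signal only the coefficients with n = ±1 and n = ±2 are different from zero, so truncating the sum at n = 5 loses nothing.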

Another category consists of fields which are expandable in a Fourier integral containing 
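a continuous distribution of different frequencies. For this to be possible, the function $f(t)$
must satisfy certain definite conditions; usually we consider functions which vanish for $t \to
\pm\infty$. Such an expansion has the form

$$f(t) = \int_{-\infty}^{\infty} f_\omega e^{-i\omega t}\,\frac{d\omega}{2\pi}, \qquad (49.5)$$

where the Fourier components are given in terms of the function $f(t)$ by the integrals

$$f_\omega = \int_{-\infty}^{\infty} f(t)\,e^{i\omega t}\,dt. \qquad (49.6)$$

Analogously to (49.3),

$$f_{-\omega} = f_\omega^*. \qquad (49.7)$$

Let us express the total intensity of the wave, i.e. the integral of $f^2$ over all time, in terms
of the intensity of the Fourier components. Using (49.5) and (49.6), we have:

$$\int_{-\infty}^{\infty} f^2\,dt = \int_{-\infty}^{\infty}\left\{f\int_{-\infty}^{\infty} f_\omega e^{-i\omega t}\,\frac{d\omega}{2\pi}\right\}dt = \int_{-\infty}^{\infty}\left\{f_\omega\int_{-\infty}^{\infty} f e^{-i\omega t}\,dt\right\}\frac{d\omega}{2\pi} = \int_{-\infty}^{\infty} f_\omega f_{-\omega}\,\frac{d\omega}{2\pi},$$

or, using (49.7),

$$\int_{-\infty}^{\infty} f^2\,dt = \int_{-\infty}^{\infty} |f_\omega|^2\,\frac{d\omega}{2\pi} = 2\int_{0}^{\infty} |f_\omega|^2\,\frac{d\omega}{2\pi}. \qquad (49.8)$$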

§ 50. Partially polarized light 

Every monochromatic wave is, by definition, necessarily polarized. However we usually 
have to deal with waves which are only approximately monochromatic, and which contain 
frequencies in a small interval $\Delta\omega$. We consider such a wave, and let $\omega$ be some average
frequency for it. Then its field (to be specific we shall consider the electric field E) at a fixed
point in space can be written in the form

$$\mathbf{E} = \mathbf{E}_0(t)\,e^{-i\omega t},$$

where the complex amplitude $\mathbf{E}_0(t)$ is some slowly varying function of the time (for a strictly
monochromatic wave $\mathbf{E}_0$ would be constant). Since $\mathbf{E}_0$ determines the polarization of the
wave, this means that at each point of the wave, its polarization changes with time; such a
wave is said to be partially polarized.

The polarization properties of electromagnetic waves, and of light in particular, are observed 
experimentally by passing the light to be investigated through various bodies† and then
observing the intensity of the transmitted light. From the mathematical point of view this 
means that we draw conclusions concerning the polarization properties of the light from the 
values of certain quadratic functions of its field. Here of course we are considering the time 
averages of such functions. 

Quadratic functions of the field are made up of terms proportional to the products $E_\alpha E_\beta$,
$E_\alpha^* E_\beta^*$ or $E_\alpha E_\beta^*$. Products of the form

$$E_\alpha E_\beta = E_{0\alpha}E_{0\beta}\,e^{-2i\omega t}, \qquad E_\alpha^* E_\beta^* = E_{0\alpha}^* E_{0\beta}^*\,e^{2i\omega t},$$

which contain the rapidly oscillating factors $e^{\pm 2i\omega t}$ give zero when the time average is taken.
The products $E_\alpha E_\beta^* = E_{0\alpha}E_{0\beta}^*$ do not contain such factors, and so their averages are not
zero. Thus we see that the polarization properties of the light are completely characterized
by the tensor

$$J_{\alpha\beta} = \overline{E_{0\alpha}E_{0\beta}^*}. \qquad (50.1)$$

† For example, through a Nicol prism.

Since the vector $\mathbf{E}_0$ always lies in a plane perpendicular to the direction of the wave, the
tensor $J_{\alpha\beta}$ has altogether four components (in this section the indices $\alpha$, $\beta$ are understood to
take on only two values: $\alpha, \beta = 1, 2$, corresponding to the y and z axes; the x axis is along
the direction of propagation of the wave).

The sum of the diagonal elements of the tensor $J_{\alpha\beta}$ (we denote it by J) is a real quantity:
the average value of the square modulus of the vector $\mathbf{E}_0$ (or E):

$$J \equiv J_{\alpha\alpha} = \overline{\mathbf{E}_0 \cdot \mathbf{E}_0^*}. \qquad (50.2)$$

This quantity determines the intensity of the wave, as measured by the energy flux density. 
To eliminate this quantity which is not directly related to the polarization properties, we 
introduce in place of $J_{\alpha\beta}$ the tensor

$$\rho_{\alpha\beta} = \frac{J_{\alpha\beta}}{J}, \qquad (50.3)$$

for which $\rho_{\alpha\alpha} = 1$; we call it the polarization tensor.

From the definition (50.1) we see that the components of the tensor $J_{\alpha\beta}$, and consequently
also $\rho_{\alpha\beta}$, are related by

$$\rho_{\alpha\beta} = \rho_{\beta\alpha}^* \qquad (50.4)$$

(i.e. the tensor is hermitian). Consequently the diagonal components $\rho_{11}$ and $\rho_{22}$ are real
(with $\rho_{11} + \rho_{22} = 1$) while $\rho_{21} = \rho_{12}^*$. Thus the polarization is characterized by three real
parameters.

Let us study the conditions that the tensor $\rho_{\alpha\beta}$ must satisfy for completely polarized light.
In this case $\mathbf{E}_0 = \text{const}$, and so we have simply

$$J_{\alpha\beta} = J\rho_{\alpha\beta} = E_{0\alpha}E_{0\beta}^* \qquad (50.5)$$

(without averaging), i.e. the components of the tensor can be written as products of components
of some constant vector. The necessary and sufficient condition for this is that the determinant
vanish:

$$|\rho_{\alpha\beta}| = \rho_{11}\rho_{22} - \rho_{12}\rho_{21} = 0. \qquad (50.6)$$

The opposite case is that of unpolarized or natural light. Complete absence of polarization
means that all directions (in the yz plane) are equivalent. In other words the polarization
tensor must have the form:

$$\rho_{\alpha\beta} = \tfrac{1}{2}\delta_{\alpha\beta}. \qquad (50.7)$$

The determinant is $|\rho_{\alpha\beta}| = \tfrac{1}{4}$.

In the general case of arbitrary polarization the determinant has values between 0 and $\tfrac{1}{4}$.†

† The fact that the determinant is positive for any tensor of the form (50.1) is easily seen by considering
the averaging, for simplicity, as a summation over discrete values, and using the well-known algebraic
inequality

$$\Big|\sum_a x_a y_a\Big|^2 \le \sum_a |x_a|^2 \sum_b |y_b|^2.$$

We define the degree of polarization as the positive quantity P given by

$$|\rho_{\alpha\beta}| = \tfrac{1}{4}\left(1 - P^2\right). \qquad (50.8)$$

It runs from the value 0 for unpolarized to 1 for polarized light.
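
The definitions (50.1), (50.3) and (50.8) translate directly into a small computation. In the sketch below the time average is replaced by an average over a list of sampled amplitudes, which is an assumption of the illustration (in the spirit of treating the averaging as a summation over discrete values); the function names and sample data are hypothetical.

```python
import math

def polarization_tensor(amplitudes):
    """rho = J / J_trace, with J[a][b] the average of E0[a] * conj(E0[b]).

    `amplitudes` is a list of (E0y, E0z) pairs sampled at successive times.
    """
    n = len(amplitudes)
    J = [[0j, 0j], [0j, 0j]]
    for E0 in amplitudes:
        for a in range(2):
            for b in range(2):
                J[a][b] += E0[a] * complex(E0[b]).conjugate() / n
    intensity = (J[0][0] + J[1][1]).real
    return [[J[a][b] / intensity for b in range(2)] for a in range(2)]

def degree_of_polarization(rho):
    """P from det(rho) = (1 - P^2)/4, formula (50.8)."""
    det = (rho[0][0]*rho[1][1] - rho[0][1]*rho[1][0]).real
    return math.sqrt(max(0.0, 1.0 - 4.0*det))

polarized = [(1.0, 1j)] * 4                      # constant amplitude
natural = [(1.0, 0.0), (0.0, 1.0)]               # equal incoherent mixture
partial = [(1.0, 0.0)] * 3 + [(0.0, 1.0)]        # unequal mixture
for sample in (polarized, natural, partial):
    print(degree_of_polarization(polarization_tensor(sample)))
```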

An arbitrary tensor $\rho_{\alpha\beta}$ can be split into two parts: a symmetric and an antisymmetric
part. Of these, the first

$$S_{\alpha\beta} = \tfrac{1}{2}\left(\rho_{\alpha\beta} + \rho_{\beta\alpha}\right)$$

is real because of the hermiticity of $\rho_{\alpha\beta}$. The antisymmetric part is pure imaginary. Like any
antisymmetric tensor of rank equal to the number of dimensions, it reduces to a pseudoscalar
(see the footnote on p. 18):

$$\tfrac{1}{2}\left(\rho_{\alpha\beta} - \rho_{\beta\alpha}\right) = -\frac{i}{2}\,e_{\alpha\beta}A,$$

where A is a real pseudoscalar, $e_{\alpha\beta}$ is the unit antisymmetric tensor (with components $e_{12}
= -e_{21} = 1$). Thus the polarization tensor has the form:

$$\rho_{\alpha\beta} = S_{\alpha\beta} - \frac{i}{2}\,e_{\alpha\beta}A, \qquad S_{\alpha\beta} = S_{\beta\alpha}, \qquad (50.9)$$

i.e. it reduces to one real symmetric tensor and one pseudoscalar.

For a circularly polarized wave, the vector $\mathbf{E}_0 = \text{const}$, where

$$E_{02} = \pm iE_{01}.$$

It is easy to see that then $S_{\alpha\beta} = \tfrac{1}{2}\delta_{\alpha\beta}$, while $A = \pm 1$. On the other hand, for a linearly
polarized wave the constant vector $\mathbf{E}_0$ can be chosen to be real, so that A = 0. In the general
case the quantity A may be called the degree of circular polarization; it runs through values
from +1 to -1, where the limiting values correspond to right- and left-circularly polarized
waves, respectively.

The real symmetric tensor $S_{\alpha\beta}$, like any symmetric tensor, can be brought to principal
axes, with different principal values which we denote by $\lambda_1$ and $\lambda_2$. The directions of the
principal axes are mutually perpendicular. Denoting the unit vectors along these directions
by $\mathbf{n}^{(1)}$ and $\mathbf{n}^{(2)}$, we can write $S_{\alpha\beta}$ in the form:

$$S_{\alpha\beta} = \lambda_1 n^{(1)}_\alpha n^{(1)}_\beta + \lambda_2 n^{(2)}_\alpha n^{(2)}_\beta, \qquad \lambda_1 + \lambda_2 = 1. \qquad (50.10)$$

The quantities $\lambda_1$ and $\lambda_2$ are positive and take on values from 0 to 1.

Suppose that A = 0, so that $\rho_{\alpha\beta} = S_{\alpha\beta}$. Each of the two terms in (50.10) has the form of a
product of two components of a constant vector ($\sqrt{\lambda_1}\,\mathbf{n}^{(1)}$ or $\sqrt{\lambda_2}\,\mathbf{n}^{(2)}$). In other words,
each of the terms corresponds to linearly polarized light. Furthermore, we see that there is
no term in (50.10) containing products of components of the two waves. This means that the
two parts can be regarded as physically independent of one another, or, as one says, they are
incoherent. In fact, if two waves are independent, the average value of the product $E^{(1)}_\alpha E^{(2)}_\beta$
is equal to the product of the averages of each of the factors, and since each of them is zero,

$$\overline{E^{(1)}_\alpha E^{(2)}_\beta} = 0.$$
Thus we arrive at the result that in this case (A = 0) the partially polarized light can be
represented as a superposition of two incoherent waves (with intensities proportional to $\lambda_1$
and $\lambda_2$), linearly polarized along mutually perpendicular directions.† (In the general case of
a complex tensor $\rho_{\alpha\beta}$ one can show that the light can be represented as a superposition of
two incoherent elliptically polarized waves, whose polarization ellipses are similar and
mutually perpendicular (see problem 2).)

Let $\phi$ be the angle between the axis 1 (the y axis) and the unit vector $\mathbf{n}^{(1)}$; then

$$\mathbf{n}^{(1)} = (\cos\phi,\ \sin\phi), \qquad \mathbf{n}^{(2)} = (-\sin\phi,\ \cos\phi).$$

Introducing the quantity $l = \lambda_1 - \lambda_2$ (assume $\lambda_1 > \lambda_2$), we write the components of the tensor
(50.10) in the following form:

$$S_{\alpha\beta} = \frac{1}{2}\begin{pmatrix} 1 + l\cos 2\phi & l\sin 2\phi \\ l\sin 2\phi & 1 - l\cos 2\phi \end{pmatrix}. \qquad (50.11)$$


Thus, for an arbitrary choice of the axes y and z, the polarization properties of the wave can 
be characterized by the following three real parameters: A —the degree of circular polarization, 
/—the degree of maximum linear polarization, and 0 —the angle between the direction n (1) 
of maximum polarization and the y axis. 

In place of these parameters one can choose another set of three parameters: 

= l sin 20, & - A, £3 = / cos 20 (50.12) 


(the Stokes parameters). The polarization tensor is expressed in terms of them as 



1 + £3 

5 . + 


l-«3 / 


(50.13) 


All three parameters run through values from -1 to +1. The parameter $\xi_3$ characterizes the
linear polarization along the $y$ and $z$ axes: the value $\xi_3 = 1$ corresponds to complete linear
polarization along the $y$ axis, and $\xi_3 = -1$ to complete polarization along the $z$ axis. The
parameter $\xi_1$ characterizes the linear polarization along directions making an angle of 45°
with the $y$ axis: the value $\xi_1 = 1$ means complete polarization at an angle $\phi = \pi/4$, while
$\xi_1 = -1$ means complete polarization at $\phi = -\pi/4$.‡

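The determinant of (50.13) is equal to

$$|\rho_{\alpha\beta}| = \tfrac{1}{4}\left(1 - \xi_1^2 - \xi_2^2 - \xi_3^2\right). \tag{50.14}$$

Comparing with (50.8), we see that

$$P = \sqrt{\xi_1^2 + \xi_2^2 + \xi_3^2}. \tag{50.15}$$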


† The determinant $|S_{\alpha\beta}| = \lambda_1\lambda_2$; suppose that $\lambda_1 > \lambda_2$; then the degree of polarization, as defined in (50.8),
is $P = 1 - 2\lambda_2$. In the present case ($A = 0$) one frequently characterizes the degree of polarization by using
the depolarization coefficient, defined as the ratio $\lambda_2/\lambda_1$.

‡ For a completely elliptically polarized wave with axes of the ellipse $b_1$ and $b_2$ (see § 48), the Stokes
parameters are:

$$\xi_1 = 0, \qquad \xi_2 = \pm 2b_1b_2, \qquad \xi_3 = b_1^2 - b_2^2.$$

Here the $y$ axis is along $\mathbf{b}_1$, while the two signs in $\xi_2$ correspond to directions of $\mathbf{b}_2$ along and opposite to
the direction of the $z$ axis.
Thus, for a given overall degree of polarization $P$, different types of polarization are possible,
characterized by the values of the three quantities $\xi_1$, $\xi_2$, $\xi_3$, the sum of whose squares is
fixed; they form a sort of vector of fixed length.
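
The relations (50.13) to (50.15) are easy to check numerically. The following Python sketch is an illustration only; the function names and the sample Stokes parameters are assumptions made for the example. It builds the tensor (50.13) and compares its determinant with the value $(1 - P^2)/4$ required by (50.8).

```python
import math

def polarization_tensor(xi1, xi2, xi3):
    """2x2 polarization tensor of (50.13), as nested lists (rows y, z)."""
    return [
        [0.5 * (1 + xi3), 0.5 * complex(xi1, -xi2)],
        [0.5 * complex(xi1, xi2), 0.5 * (1 - xi3)],
    ]

def determinant(rho):
    """Determinant of a 2x2 tensor; it is real when rho is hermitian."""
    return (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real

def degree_of_polarization(xi1, xi2, xi3):
    """Degree of polarization P according to (50.15)."""
    return math.sqrt(xi1**2 + xi2**2 + xi3**2)

# Sample values, chosen only for illustration.
xi = (0.3, 0.4, 0.5)
rho = polarization_tensor(*xi)
P = degree_of_polarization(*xi)

# The two numbers printed should agree: |rho| = (1 - P^2)/4.
print(determinant(rho), (1 - P**2) / 4)
```

Any triple with $\xi_1^2 + \xi_2^2 + \xi_3^2 \le 1$ can be substituted for the sample values; the two printed numbers then agree to rounding error.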

We note that the quantities $\xi_2 = A$ and $\sqrt{\xi_1^2 + \xi_3^2} = l$ are invariant under Lorentz
transformations. This remark is already almost obvious from the very meaning of these
quantities as degrees of circular and linear polarization.†


PROBLEMS 

1. Resolve an arbitrary partially polarized light wave into its “natural” and “polarized” parts. 

Solution: This resolution means the representation of the tensor $J_{\alpha\beta}$ in the form

$$J_{\alpha\beta} = \tfrac{1}{2}J^{(n)}\delta_{\alpha\beta} + E^{(p)}_{0\alpha}E^{(p)*}_{0\beta}.$$

The first term corresponds to the natural, and the second to the polarized parts of the light. To determine the
intensities of the parts we note that the determinant

$$\left|J_{\alpha\beta} - \tfrac{1}{2}J^{(n)}\delta_{\alpha\beta}\right| = \left|E^{(p)}_{0\alpha}E^{(p)*}_{0\beta}\right| = 0.$$

Writing $J_{\alpha\beta} = J\rho_{\alpha\beta}$ in the form (50.13) and solving the equation, we get

$$J^{(n)} = J(1 - P).$$

The intensity of the polarized part is $J^{(p)} = |\mathbf{E}^{(p)}_0|^2 = J - J^{(n)} = JP$.

The polarized part of the light is in general an elliptically polarized wave, where the directions of the axes
of the ellipse coincide with the principal axes of the tensor $S_{\alpha\beta}$. The lengths $b_1$ and $b_2$ of the axes of the
ellipse and the angle $\phi$ formed by the axis $\mathbf{b}_1$ and the $y$ axis are given by the equations:

$$b_1^2 + b_2^2 = JP, \qquad 2b_1b_2 = J\xi_2, \qquad \tan 2\phi = \frac{\xi_1}{\xi_3}.$$
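
The resolution of problem 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name is hypothetical, the second relation is taken in the form $2b_1b_2 = J|\xi_2|$ (the sign of $\xi_2$ only fixes the sense of rotation), and the angle is chosen on the branch consistent with (50.12).

```python
import math

def resolve_natural_and_polarized(J, xi1, xi2, xi3):
    """Split light of total intensity J into natural and polarized parts.

    Returns (J_natural, J_polarized, b1, b2, phi), where b1 >= b2 are the
    semi-axes of the ellipse of the polarized part and phi is the angle
    between the axis b1 and the y axis.
    """
    P = math.sqrt(xi1**2 + xi2**2 + xi3**2)
    J_natural = J * (1 - P)
    J_polarized = J * P
    # b1^2 + b2^2 = J P and 2 b1 b2 = J |xi2| give (b1 +- b2)^2 = J (P +- |xi2|).
    s = math.sqrt(J * (P + abs(xi2)))
    d = math.sqrt(max(J * (P - abs(xi2)), 0.0))
    b1, b2 = 0.5 * (s + d), 0.5 * (s - d)
    # tan(2 phi) = xi1 / xi3; atan2 selects the branch with xi3 ~ cos(2 phi).
    phi = 0.5 * math.atan2(xi1, xi3)
    return J_natural, J_polarized, b1, b2, phi

# Sample input: completely polarized light (P = 1) with an elliptical state.
print(resolve_natural_and_polarized(2.0, 0.0, 0.6, 0.8))
```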

2. Represent an arbitrary partially polarized wave as a superposition of two incoherent elliptically 
polarized waves. 

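Solution: For the hermitian tensor $\rho_{\alpha\beta}$ the “principal axes” are determined by two unit complex vectors
$\mathbf{n}$ ($\mathbf{n}\cdot\mathbf{n}^* = 1$), satisfying the equations

$$\rho_{\alpha\beta}n_\beta = \lambda n_\alpha. \tag{1}$$

The principal values $\lambda_1$ and $\lambda_2$ are the roots of the equation

$$|\rho_{\alpha\beta} - \lambda\delta_{\alpha\beta}| = 0.$$

Multiplying (1) on both sides by $n^*_\alpha$, we have:

$$\lambda = \rho_{\alpha\beta}n^*_\alpha n_\beta = \frac{1}{J}\overline{|E_{0\alpha}n^*_\alpha|^2},$$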

† For a direct proof, we note that since the field of the wave is transverse in any reference frame, it is clear
from the start that the tensor $\rho_{\alpha\beta}$ remains two-dimensional in any new frame. The transformation of $\rho_{\alpha\beta}$ into
$\rho'_{\alpha\beta}$ leaves unchanged the sum of absolute squares $\rho_{\alpha\beta}\rho^*_{\alpha\beta}$ (in fact, the form of the transformation does not
depend on the specific polarization properties of the light, while for a completely polarized wave this sum
is 1 in any reference system). Because this transformation is real, the real and imaginary parts of the tensor
$\rho_{\alpha\beta}$ (50.9) transform independently, so that the sums of the squares of the components of each separately
remain constant, and are expressed in terms of $l$ and $A$.
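from which we see that $\lambda_1$, $\lambda_2$ are real and positive. Multiplying the equations

$$\rho_{\alpha\beta}n^{(1)}_\beta = \lambda_1 n^{(1)}_\alpha, \qquad \rho^*_{\alpha\beta}n^{(2)*}_\beta = \lambda_2 n^{(2)*}_\alpha$$

for the first by $n^{(2)*}_\alpha$ and for the second by $n^{(1)}_\alpha$, taking the difference of the results and using the hermiticity
of $\rho_{\alpha\beta}$, we get:

$$(\lambda_1 - \lambda_2)\,n^{(1)}_\alpha n^{(2)*}_\alpha = 0.$$

It then follows that $\mathbf{n}^{(1)}\cdot\mathbf{n}^{(2)*} = 0$, i.e. the unit vectors $\mathbf{n}^{(1)}$ and $\mathbf{n}^{(2)}$ are mutually orthogonal.

The expansion of the wave is provided by the formula

$$\rho_{\alpha\beta} = \lambda_1 n^{(1)}_\alpha n^{(1)*}_\beta + \lambda_2 n^{(2)}_\alpha n^{(2)*}_\beta.$$

One can always choose the complex amplitude so that, of the two mutually perpendicular components, one
is real and the other imaginary (compare § 48). Setting

$$n^{(1)}_1 = b_1, \qquad n^{(1)}_2 = ib_2$$

(where now $b_1$ and $b_2$ are understood to be normalized by the condition $b_1^2 + b_2^2 = 1$), we get from the
equation $\mathbf{n}^{(1)}\cdot\mathbf{n}^{(2)*} = 0$:

$$n^{(2)}_1 = ib_2, \qquad n^{(2)}_2 = b_1.$$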

We then see that the ellipses of the two elliptically polarized vibrations are similar (have equal axis ratio), 
and one of them is turned through 90° relative to the other. 

3. Find the law of transformation of the Stokes parameters for a rotation of the $y$, $z$ axes through an
angle $\phi$.

Solution: The law is determined by the connection of the Stokes parameters to the components of the
two-dimensional tensor in the $yz$ plane, and is given by the formulas

$$\xi_1' = \xi_1\cos 2\phi - \xi_3\sin 2\phi, \qquad \xi_3' = \xi_1\sin 2\phi + \xi_3\cos 2\phi, \qquad \xi_2' = \xi_2.$$
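
The transformation law of problem 3 can be tried out with the following Python sketch (the function name and the sample numbers are assumptions for illustration). Starting from $\xi_1 = l\sin 2\phi_0$, $\xi_3 = l\cos 2\phi_0$, a rotation of the axes through $\phi$ should simply replace $\phi_0$ by $\phi_0 - \phi$ and leave $\xi_2$ and $\xi_1^2 + \xi_3^2$ unchanged.

```python
import math

def rotate_stokes(xi1, xi2, xi3, phi):
    """Stokes parameters (xi1', xi2', xi3') after rotating the y, z axes by phi."""
    c, s = math.cos(2 * phi), math.sin(2 * phi)
    return (xi1 * c - xi3 * s, xi2, xi1 * s + xi3 * c)

# Sample state: linear part l at angle phi0 to the y axis, circular part 0.2.
l, phi0, phi = 0.5, 0.3, 0.1
xi1, xi2, xi3 = l * math.sin(2 * phi0), 0.2, l * math.cos(2 * phi0)
new1, new2, new3 = rotate_stokes(xi1, xi2, xi3, phi)

# Compare with the same state described by the angle phi0 - phi.
print(new1, l * math.sin(2 * (phi0 - phi)))
print(new3, l * math.cos(2 * (phi0 - phi)))
```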

§ 51. The Fourier resolution of the electrostatic field 

The field produced by charges can also be formally expanded in plane waves (in a Fourier 
integral). This expansion, however, is essentially different from the expansion of electromagnetic 
waves in vacuum, for the field produced by charges does not satisfy the homogeneous wave 
equation, and therefore each term of this expansion does not satisfy the equation. From this 
it follows that for the plane waves into which the field of charges can be expanded, the 
relation $k^2 = \omega^2/c^2$, which holds for plane monochromatic electromagnetic waves, is not
fulfilled. 

In particular, if we formally represent the electrostatic field as a superposition of plane 
waves, then the “frequency” of these waves is clearly zero, since the field under consideration 
does not depend on the time. The wave vectors themselves are, of course, different from 
zero. 

We consider the field produced by a point charge e, located at the origin of coordinates. 
The potential $\phi$ of this field is determined by the equation (see § 36)

$$\Delta\phi = -4\pi e\,\delta(\mathbf{r}). \tag{51.1}$$

We expand $\phi$ in a Fourier integral, i.e. we represent it in the form

$$\phi = \int e^{i\mathbf{k}\cdot\mathbf{r}}\,\phi_\mathbf{k}\,\frac{d^3k}{(2\pi)^3}, \tag{51.2}$$

where $d^3k$ denotes $dk_x\,dk_y\,dk_z$. In this formula $\phi_\mathbf{k} = \int\phi(\mathbf{r})\,e^{-i\mathbf{k}\cdot\mathbf{r}}\,dV$. Applying the Laplace
operator to both sides of (51.2), we obtain


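$$\Delta\phi = -\int k^2 e^{i\mathbf{k}\cdot\mathbf{r}}\,\phi_\mathbf{k}\,\frac{d^3k}{(2\pi)^3},$$

so that the Fourier component of the expression $\Delta\phi$ is

$$(\Delta\phi)_\mathbf{k} = -k^2\phi_\mathbf{k}.$$

On the other hand, we can find $(\Delta\phi)_\mathbf{k}$ by taking Fourier components of both sides of
equation (51.1),

$$(\Delta\phi)_\mathbf{k} = -\int 4\pi e\,\delta(\mathbf{r})\,e^{-i\mathbf{k}\cdot\mathbf{r}}\,dV = -4\pi e.$$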


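Equating the two expressions obtained for $(\Delta\phi)_\mathbf{k}$, we find

$$\phi_\mathbf{k} = \frac{4\pi e}{k^2}. \tag{51.3}$$

This formula solves our problem.

Just as for the potential $\phi$, we can expand the field

$$\mathbf{E} = \int\mathbf{E}_\mathbf{k}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\frac{d^3k}{(2\pi)^3}. \tag{51.4}$$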


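With the aid of (51.2), we have

$$\mathbf{E} = -\mathrm{grad}\int\phi_\mathbf{k}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\frac{d^3k}{(2\pi)^3} = -\int i\mathbf{k}\,\phi_\mathbf{k}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\frac{d^3k}{(2\pi)^3}.$$

Comparing with (51.4), we obtain

$$\mathbf{E}_\mathbf{k} = -i\mathbf{k}\phi_\mathbf{k} = -i\,\frac{4\pi e\,\mathbf{k}}{k^2}. \tag{51.5}$$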


From this we see that the field of the waves, into which we have resolved the Coulomb field, 
is directed along the wave vector. Therefore these waves can be said to be longitudinal. 
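
As a small illustration of (51.3) and (51.5), the following Python sketch evaluates the Fourier components of the Coulomb potential and field for a given wave vector and checks that $\mathbf{E}_\mathbf{k}$ is parallel to $\mathbf{k}$. The function names, the Gaussian-unit charge $e = 1$ and the sample wave vector are assumptions made for the example.

```python
import math

def coulomb_fourier_components(e, k):
    """Return (phi_k, E_k) for a point charge e and wave vector k = (kx, ky, kz)."""
    k_squared = sum(component * component for component in k)
    if k_squared == 0:
        raise ValueError("phi_k diverges at k = 0")
    phi_k = 4 * math.pi * e / k_squared                       # formula (51.3)
    E_k = tuple(-1j * component * phi_k for component in k)   # formula (51.5)
    return phi_k, E_k

def cross(a, b):
    """Vector product of two 3-component tuples (components may be complex)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

k = (1.0, 2.0, 2.0)          # sample wave vector with k^2 = 9
phi_k, E_k = coulomb_fourier_components(1.0, k)
print(phi_k)                 # equals 4*pi/9
print(cross(E_k, k))         # vanishes: the waves are longitudinal
```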


§ 52. Characteristic vibrations of the field 

We consider an electromagnetic field (in the absence of charges) in some finite volume of 
space. To simplify further calculations we assume that this volume has the form of a rectangular 
parallelepiped with sides $A$, $B$, $C$, respectively. Then we can expand all quantities characterizing
the field in this parallelepiped in a triple Fourier series (for the three coordinates). This 
expansion can be written (e.g. for the vector potential) in the form: 
$$\mathbf{A} = \sum_\mathbf{k}\mathbf{A}_\mathbf{k}e^{i\mathbf{k}\cdot\mathbf{r}} \tag{52.1}$$

explicitly indicating that $\mathbf{A}$ is real. The summation extends here over all possible values of
the vector $\mathbf{k}$ whose components run through the values

$$k_x = \frac{2\pi n_x}{A}, \qquad k_y = \frac{2\pi n_y}{B}, \qquad k_z = \frac{2\pi n_z}{C}, \tag{52.2}$$

where $n_x$, $n_y$, $n_z$ are positive or negative integers. Since $\mathbf{A}$ is real, the coefficients in the
expansion (52.1) are related by the equations $\mathbf{A}_{-\mathbf{k}} = \mathbf{A}^*_\mathbf{k}$. From the equation div $\mathbf{A}$ = 0 it
follows that for each $\mathbf{k}$,

$$\mathbf{k}\cdot\mathbf{A}_\mathbf{k} = 0, \tag{52.3}$$

i.e., the complex vectors $\mathbf{A}_\mathbf{k}$ are “perpendicular” to the corresponding wave vectors $\mathbf{k}$. The
vectors $\mathbf{A}_\mathbf{k}$ are, of course, functions of the time; from the wave equation (46.7), they satisfy
the equation

$$\ddot{\mathbf{A}}_\mathbf{k} + c^2k^2\mathbf{A}_\mathbf{k} = 0. \tag{52.4}$$

If the dimensions $A$, $B$, $C$ of the volume are sufficiently large, then neighbouring values
of $k_x$, $k_y$, $k_z$ (for which $n_x$, $n_y$, $n_z$ differ by unity) are very close to one another. In this case we
may speak of the number of possible values of $k_x$, $k_y$, $k_z$ in the small intervals $\Delta k_x$, $\Delta k_y$, $\Delta k_z$.

Since to neighbouring values of, say, $k_x$, there correspond values of $n_x$ differing by unity,
the number $\Delta n_x$ of possible values of $k_x$ in the interval $\Delta k_x$ is equal simply to the number of
values of $n_x$ in the corresponding interval. Thus, we obtain

$$\Delta n_x = \frac{A}{2\pi}\Delta k_x, \qquad \Delta n_y = \frac{B}{2\pi}\Delta k_y, \qquad \Delta n_z = \frac{C}{2\pi}\Delta k_z.$$

The total number $\Delta n$ of possible values of the vector $\mathbf{k}$ with components in the intervals $\Delta k_x$,
$\Delta k_y$, $\Delta k_z$ is equal to the product $\Delta n_x\Delta n_y\Delta n_z$, that is,

$$\Delta n = \frac{V}{(2\pi)^3}\Delta k_x\Delta k_y\Delta k_z, \tag{52.5}$$

where $V = ABC$ is the volume of the field. It is easy to determine from this the number of
possible values of the wave vector having absolute values in the interval $\Delta k$, and directed
into the element of solid angle $\Delta o$. To get this we need only transform to polar coordinates
in the “$\mathbf{k}$ space” and write in place of $\Delta k_x\Delta k_y\Delta k_z$ the element of volume in these coordinates.
Thus

$$\Delta n = \frac{V}{(2\pi)^3}k^2\,\Delta k\,\Delta o. \tag{52.6}$$

Replacing $\Delta o$ by $4\pi$, we find the number of possible values of $\mathbf{k}$ with absolute value in the
interval $\Delta k$ and pointing in all directions: $\Delta n = (V/2\pi^2)k^2\Delta k$.
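
The counting formula can be compared with a direct enumeration of the allowed wave vectors. The Python sketch below is an illustration only: it assumes a cube of side $L = 1$ (so that $k = 2\pi|\mathbf{n}|/L$), picks an arbitrary shell $30 \le |\mathbf{n}| < 32$, and uses hypothetical function names. The estimate is asymptotic, so agreement is expected only to within a few per cent.

```python
import math

def count_modes(n_min, n_max):
    """Count integer triples (nx, ny, nz) with n_min <= |n| < n_max."""
    limit = int(math.ceil(n_max))
    low, high = n_min**2, n_max**2
    count = 0
    for nx in range(-limit, limit + 1):
        for ny in range(-limit, limit + 1):
            for nz in range(-limit, limit + 1):
                if low <= nx * nx + ny * ny + nz * nz < high:
                    count += 1
    return count

def estimate_modes(V, k, dk):
    """Number of wave vectors with |k| in an interval dk: (V / 2 pi^2) k^2 dk."""
    return V * k**2 * dk / (2 * math.pi**2)

L = 1.0                                   # side of the cube, V = L**3
n_min, n_max = 30, 32                     # shell in units of 2*pi/L
k_mid = 2 * math.pi * 0.5 * (n_min + n_max) / L
dk = 2 * math.pi * (n_max - n_min) / L

exact = count_modes(n_min, n_max)
approx = estimate_modes(L**3, k_mid, dk)
print(exact, round(approx))
```

Each wave vector counted here still carries two independent polarizations, which the formula in the text does not include.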

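We calculate the total energy

$$\mathcal{E} = \frac{1}{8\pi}\int(E^2 + H^2)\,dV$$

of the field, expressing it in terms of the quantities $\mathbf{A}_\mathbf{k}$. For the electric and magnetic fields
we have

$$\mathbf{E} = -\frac{1}{c}\dot{\mathbf{A}} = -\frac{1}{c}\sum_\mathbf{k}\dot{\mathbf{A}}_\mathbf{k}e^{i\mathbf{k}\cdot\mathbf{r}}, \qquad \mathbf{H} = \mathrm{curl}\,\mathbf{A} = i\sum_\mathbf{k}(\mathbf{k}\times\mathbf{A}_\mathbf{k})e^{i\mathbf{k}\cdot\mathbf{r}}. \tag{52.7}$$

When calculating the squares of these sums, we must keep in mind that all products of terms
with wave vectors $\mathbf{k}$ and $\mathbf{k}'$ such that $\mathbf{k}' \neq -\mathbf{k}$ give zero on integration over the whole volume.
In fact, such terms contain factors of the form $e^{i(\mathbf{k}+\mathbf{k}')\cdot\mathbf{r}}$, and the integral, e.g. of

$$\int_0^A e^{i(2\pi/A)n_xx}\,dx$$

with integer $n_x$ different from zero, gives zero. In those terms with $\mathbf{k}' = -\mathbf{k}$, the exponentials
drop out and integration over $dV$ gives just the volume $V$.

As a result, we obtain

$$\mathcal{E} = \frac{V}{8\pi}\sum_\mathbf{k}\left\{\frac{1}{c^2}\dot{\mathbf{A}}_\mathbf{k}\cdot\dot{\mathbf{A}}^*_\mathbf{k} + (\mathbf{k}\times\mathbf{A}_\mathbf{k})\cdot(\mathbf{k}\times\mathbf{A}^*_\mathbf{k})\right\}.$$

From (52.3), we have

$$(\mathbf{k}\times\mathbf{A}_\mathbf{k})\cdot(\mathbf{k}\times\mathbf{A}^*_\mathbf{k}) = k^2\mathbf{A}_\mathbf{k}\cdot\mathbf{A}^*_\mathbf{k},$$

so that

$$\mathcal{E} = \frac{V}{8\pi c^2}\sum_\mathbf{k}\left\{\dot{\mathbf{A}}_\mathbf{k}\cdot\dot{\mathbf{A}}^*_\mathbf{k} + k^2c^2\mathbf{A}_\mathbf{k}\cdot\mathbf{A}^*_\mathbf{k}\right\}. \tag{52.8}$$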

Each term of this sum corresponds to one of the terms of the expansion (52.1). 

Because of (52.4), the vectors $\mathbf{A}_\mathbf{k}$ are harmonic functions of the time with frequencies
$\omega_k = ck$, depending only on the absolute value of the wave vector. Depending on the choice of
these functions, the terms in the expansion (52.1) can represent standing or running plane
waves. We shall write the expansion so that its terms describe running waves. To do this we
write it in the form

$$\mathbf{A} = \sum_\mathbf{k}\left(\mathbf{a}_\mathbf{k}e^{i\mathbf{k}\cdot\mathbf{r}} + \mathbf{a}^*_\mathbf{k}e^{-i\mathbf{k}\cdot\mathbf{r}}\right), \tag{52.9}$$

which explicitly exhibits that $\mathbf{A}$ is real, and each of the vectors $\mathbf{a}_\mathbf{k}$ depends on the time
according to the law

$$\mathbf{a}_\mathbf{k} \sim e^{-i\omega_kt}, \qquad \omega_k = ck. \tag{52.10}$$

Then each individual term in the sum (52.9) will be a function only of the difference
$\mathbf{k}\cdot\mathbf{r} - \omega_kt$, which corresponds to a wave propagating in the $\mathbf{k}$ direction.

Comparing the expansions (52.9) and (52.1), we find that their coefficients are related by
the formulas

$$\mathbf{A}_\mathbf{k} = \mathbf{a}_\mathbf{k} + \mathbf{a}^*_{-\mathbf{k}},$$

and from (52.10) the time derivatives are related by

$$\dot{\mathbf{A}}_\mathbf{k} = -ick\left(\mathbf{a}_\mathbf{k} - \mathbf{a}^*_{-\mathbf{k}}\right).$$

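Substituting in (52.8), we express the field energy in terms of the coefficients of the expansion
(52.9). Terms with products of the form $\mathbf{a}_\mathbf{k}\cdot\mathbf{a}_{-\mathbf{k}}$ or $\mathbf{a}^*_\mathbf{k}\cdot\mathbf{a}^*_{-\mathbf{k}}$ cancel one another; also noting
that the sums $\sum\mathbf{a}_\mathbf{k}\cdot\mathbf{a}^*_\mathbf{k}$ and $\sum\mathbf{a}_{-\mathbf{k}}\cdot\mathbf{a}^*_{-\mathbf{k}}$ differ only in the labelling of the summation index,
and therefore coincide, we finally obtain:

$$\mathcal{E} = \sum_\mathbf{k}\mathcal{E}_\mathbf{k}, \qquad \mathcal{E}_\mathbf{k} = \frac{k^2V}{2\pi}\,\mathbf{a}_\mathbf{k}\cdot\mathbf{a}^*_\mathbf{k}. \tag{52.11}$$

Thus the total energy of the field is expressed as a sum of the energies $\mathcal{E}_\mathbf{k}$, associated with
each of the plane waves individually.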

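In a completely analogous fashion, we can calculate the total momentum of the field,

$$\frac{1}{c^2}\int\mathbf{S}\,dV = \frac{1}{4\pi c}\int\mathbf{E}\times\mathbf{H}\,dV,$$

for which we obtain

$$\sum_\mathbf{k}\frac{\mathbf{k}}{k}\,\frac{\mathcal{E}_\mathbf{k}}{c}. \tag{52.12}$$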


This result could have been anticipated in view of the relation between the energy and 
momentum of a plane wave (see § 47). 

The expansion (52.9) succeeds in expressing the field in terms of a series of discrete 
parameters (the vectors a k ), in place of the description in terms of a continuous series of 
parameters, which is essentially what is done when we give the potential $\mathbf{A}(x, y, z, t)$ at all
points of space. We now make a transformation of the variables a k , which has the result that 
the equations of the field take on a form similar to the canonical equations (Hamilton 
equations) of mechanics. 

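We introduce the real “canonical variables” $\mathbf{Q}_\mathbf{k}$ and $\mathbf{P}_\mathbf{k}$ according to the relations

$$\mathbf{Q}_\mathbf{k} = \sqrt{\frac{V}{4\pi c^2}}\left(\mathbf{a}_\mathbf{k} + \mathbf{a}^*_\mathbf{k}\right), \qquad \mathbf{P}_\mathbf{k} = -i\omega_k\sqrt{\frac{V}{4\pi c^2}}\left(\mathbf{a}_\mathbf{k} - \mathbf{a}^*_\mathbf{k}\right) = \dot{\mathbf{Q}}_\mathbf{k}. \tag{52.13}$$

The Hamiltonian of the field is obtained by substituting these expressions in the energy
(52.11):

$$\mathcal{H} = \sum_\mathbf{k}\mathcal{H}_\mathbf{k} = \sum_\mathbf{k}\tfrac{1}{2}\left(\mathbf{P}_\mathbf{k}^2 + \omega_k^2\mathbf{Q}_\mathbf{k}^2\right). \tag{52.14}$$

Then the Hamilton equations $\partial\mathcal{H}/\partial\mathbf{P}_\mathbf{k} = \dot{\mathbf{Q}}_\mathbf{k}$ coincide with $\mathbf{P}_\mathbf{k} = \dot{\mathbf{Q}}_\mathbf{k}$, which is thus a
consequence of the equations of motion. (This was achieved by an appropriate choice of the
coefficient in (52.13).) The equations of motion, $\partial\mathcal{H}/\partial\mathbf{Q}_\mathbf{k} = -\dot{\mathbf{P}}_\mathbf{k}$, become the equations

$$\ddot{\mathbf{Q}}_\mathbf{k} + \omega_k^2\mathbf{Q}_\mathbf{k} = 0, \tag{52.15}$$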

that is, they are identical with the equations of the field. 
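
The agreement between the oscillator Hamiltonian (52.14) and the mode energy (52.11) can be verified numerically for a single Cartesian component of one mode. The Python sketch below is an illustration with hypothetical function names and arbitrary sample values of $a$, $V$, $k$ and $c$.

```python
import math

def canonical_variables(a, V, k, c=1.0):
    """Real variables (Q, P) of (52.13) for one complex amplitude component a."""
    omega = c * k
    norm = math.sqrt(V / (4 * math.pi * c**2))
    Q = norm * (a + a.conjugate())
    P = -1j * omega * norm * (a - a.conjugate())
    return Q.real, P.real          # both combinations are real by construction

def mode_energy(a, V, k):
    """Energy of one mode from (52.11): k^2 V |a|^2 / (2 pi)."""
    return k**2 * V * abs(a)**2 / (2 * math.pi)

def mode_hamiltonian(Q, P, omega):
    """Oscillator Hamiltonian of (52.14): (P^2 + omega^2 Q^2) / 2."""
    return 0.5 * (P**2 + omega**2 * Q**2)

# Sample values, chosen only for illustration.
a, V, k, c = 0.3 + 0.4j, 2.0, 1.5, 3.0
Q, P = canonical_variables(a, V, k, c)
print(mode_hamiltonian(Q, P, c * k), mode_energy(a, V, k))
```

The two printed numbers coincide, which is the content of the statement that the coefficient in (52.13) was chosen appropriately.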

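Each of the vectors $\mathbf{Q}_\mathbf{k}$ and $\mathbf{P}_\mathbf{k}$ is perpendicular to the wave vector $\mathbf{k}$, i.e. has two independent
components. The direction of these vectors determines the direction of polarization of the
corresponding travelling wave. Denoting the two components of the vector $\mathbf{Q}_\mathbf{k}$ (in the plane
perpendicular to $\mathbf{k}$) by $Q_{\mathbf{k}j}$, $j = 1, 2$, we have $\mathbf{Q}_\mathbf{k}^2 = \sum_j Q_{\mathbf{k}j}^2$, and similarly for $\mathbf{P}_\mathbf{k}$. Then

$$\mathcal{H} = \sum_{\mathbf{k}j}\mathcal{H}_{\mathbf{k}j}, \qquad \mathcal{H}_{\mathbf{k}j} = \tfrac{1}{2}\left(P_{\mathbf{k}j}^2 + \omega_k^2Q_{\mathbf{k}j}^2\right). \tag{52.16}$$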

We see that the Hamiltonian splits into a sum of independent terms each of which 
contains only one pair of the quantities Q kj , P kj . Each such term corresponds to a travelling 
wave with a definite wave vector and polarization. The quantity $\mathcal{H}_{\mathbf{k}j}$ has the form of the
Hamiltonian of a one-dimensional “oscillator”, performing a simple harmonic vibration. For 
this reason, one sometimes refers to this result as the expansion of the field in terms of 
oscillators. 

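We give the formulas which express the field explicitly in terms of the variables $\mathbf{P}_\mathbf{k}$, $\mathbf{Q}_\mathbf{k}$.
From (52.13), we have

$$\mathbf{a}_\mathbf{k} = \frac{i}{k}\sqrt{\frac{\pi}{V}}\left(\mathbf{P}_\mathbf{k} - i\omega_k\mathbf{Q}_\mathbf{k}\right), \qquad \mathbf{a}^*_\mathbf{k} = -\frac{i}{k}\sqrt{\frac{\pi}{V}}\left(\mathbf{P}_\mathbf{k} + i\omega_k\mathbf{Q}_\mathbf{k}\right). \tag{52.17}$$

Substituting these expressions in (52.1), we obtain for the vector potential of the field:

$$\mathbf{A} = 2\sqrt{\frac{\pi}{V}}\sum_\mathbf{k}\frac{1}{k}\left(ck\,\mathbf{Q}_\mathbf{k}\cos\mathbf{k}\cdot\mathbf{r} - \mathbf{P}_\mathbf{k}\sin\mathbf{k}\cdot\mathbf{r}\right). \tag{52.18}$$

For the electric and magnetic fields, we find

$$\mathbf{E} = -2\sqrt{\frac{\pi}{V}}\sum_\mathbf{k}\left(ck\,\mathbf{Q}_\mathbf{k}\sin\mathbf{k}\cdot\mathbf{r} + \mathbf{P}_\mathbf{k}\cos\mathbf{k}\cdot\mathbf{r}\right),$$

$$\mathbf{H} = -2\sqrt{\frac{\pi}{V}}\sum_\mathbf{k}\frac{1}{k}\left\{ck\,(\mathbf{k}\times\mathbf{Q}_\mathbf{k})\sin\mathbf{k}\cdot\mathbf{r} + (\mathbf{k}\times\mathbf{P}_\mathbf{k})\cos\mathbf{k}\cdot\mathbf{r}\right\}. \tag{52.19}$$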
§ 53. Geometrical optics

A plane wave is characterized by the property that its direction of propagation and amplitude 
are the same everywhere. Arbitrary electromagnetic waves, of course, do not have this 
property. Nevertheless, a great many electromagnetic waves, which are not plane, have the 
property that within each small region of space they can be considered to be plane. For this, 
it is clearly necessary that the amplitude and direction of the wave remain practically 
constant over distances of the order of the wavelength. If this condition is satisfied, we can 
introduce the so-called wave surface , i.e. a surface at all of whose points the phase of the 
wave is the same (at a given time). (The wave surfaces of a plane wave are obviously planes 
perpendicular to the direction of propagation of the wave.) In each small region of space we 
can speak of a direction of propagation of the wave, normal to the wave surface. In this way 
we can introduce the concept of rays —curves whose tangents at each point coincide with 
the direction of propagation of the wave. 

The study of the laws of propagation of waves in this case constitutes the domain of 
geometrical optics. Consequently, geometrical optics considers the propagation of waves, in
particular of light, as the propagation of rays, completely divorced from their wave properties.
In other words, geometrical optics corresponds to the limiting case of small wavelength.

We now take up the derivation of the fundamental equation of geometrical optics, the
equation determining the direction of the rays. Let $f$ be any quantity describing the field of
the wave (any component of $\mathbf{E}$ or $\mathbf{H}$). For a plane monochromatic wave, $f$ has the form

$$f = ae^{i(\mathbf{k}\cdot\mathbf{r} - \omega t + \alpha)} = ae^{i(-k_ix^i + \alpha)} \tag{53.1}$$

(we omit the Re; it is understood that we take the real part of all expressions). 

We write the expression for the field in the form 

$$f = ae^{i\psi}. \tag{53.2}$$

In case the wave is not plane, but geometrical optics is applicable, the amplitude $a$ is,
generally speaking, a function of the coordinates and time, and the phase $\psi$, which is called
the eikonal, does not have a simple form, as in (53.1). It is essential, however, that $\psi$ be a
large quantity. This is clear immediately from the fact that it changes by $2\pi$ when we move
through one wavelength, and geometrical optics corresponds to the limit $\lambda \to 0$.

Over small space regions and time intervals the eikonal $\psi$ can be expanded in series; to
terms of first order, we have

$$\psi = \psi_0 + \mathbf{r}\cdot\frac{\partial\psi}{\partial\mathbf{r}} + t\,\frac{\partial\psi}{\partial t}$$

(the origin for coordinates and time has been chosen within the space region and time
interval under consideration; the derivatives are evaluated at the origin). Comparing this
expression with (53.1), we can write

$$\mathbf{k} = \frac{\partial\psi}{\partial\mathbf{r}} \equiv \mathrm{grad}\,\psi, \qquad \omega = -\frac{\partial\psi}{\partial t}, \tag{53.3}$$


which corresponds to the fact that in each small region of space (and each small interval of
time) the wave can be considered as plane. In four-dimensional form, the relation (53.3) is
expressed as

$$k_i = -\frac{\partial\psi}{\partial x^i}, \tag{53.4}$$

where $k_i$ is the wave four-vector.

We saw in § 48 that the components of the four-vector $k^i$ are related by $k_ik^i = 0$. Substituting
(53.4), we obtain the equation

$$\frac{\partial\psi}{\partial x_i}\frac{\partial\psi}{\partial x^i} = 0. \tag{53.5}$$


This equation, the eikonal equation, is the fundamental equation of geometrical optics. 

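The eikonal equation can also be derived by direct transition to the limit $\lambda \to 0$ in the wave
equation. The field $f$ satisfies the wave equation

$$\frac{\partial^2f}{\partial x_i\,\partial x^i} = 0.$$

Substituting $f = ae^{i\psi}$, we obtain

$$\frac{\partial^2a}{\partial x_i\,\partial x^i}e^{i\psi} + 2i\,\frac{\partial a}{\partial x_i}\frac{\partial\psi}{\partial x^i}e^{i\psi} + if\,\frac{\partial^2\psi}{\partial x_i\,\partial x^i} - \frac{\partial\psi}{\partial x_i}\frac{\partial\psi}{\partial x^i}f = 0. \tag{53.6}$$

But the eikonal $\psi$, as we pointed out above, is a large quantity; therefore we can neglect the
first three terms compared with the fourth, and we arrive once more at equation (53.5).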

We shall give certain relations which, in their application to the propagation of light in 
vacuum, lead only to completely obvious results. Nevertheless, they are important because, 
in their general form, these derivations apply also to the propagation of light in material 
media. 

From the form of the eikonal equation there results a remarkable analogy between geometrical 
optics and the mechanics of material particles. The motion of a material particle is determined 
by the Hamilton-Jacobi equation (16.11). This equation, like the eikonal equation, is an 
equation in the first partial derivatives and is of second degree. As we know, the action S is 
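related to the momentum $\mathbf{p}$ and the Hamiltonian $\mathcal{H}$ of the particle by the relations

$$\mathbf{p} = \frac{\partial S}{\partial\mathbf{r}}, \qquad \mathcal{H} = -\frac{\partial S}{\partial t}.$$

Comparing these formulas with the formulas (53.3), we see that the wave vector plays the same
role in geometrical optics as the momentum of the particle in mechanics, while the frequency
plays the role of the Hamiltonian, i.e. the energy of the particle.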

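For a particle, we have the Hamilton equations

$$\dot{\mathbf{p}} = -\frac{\partial\mathcal{H}}{\partial\mathbf{r}}, \qquad \mathbf{v} = \dot{\mathbf{r}} = \frac{\partial\mathcal{H}}{\partial\mathbf{p}}.$$

In view of the analogy we have pointed out, we can immediately write the corresponding
equations for rays:

$$\dot{\mathbf{k}} = -\frac{\partial\omega}{\partial\mathbf{r}}, \qquad \dot{\mathbf{r}} = \frac{\partial\omega}{\partial\mathbf{k}}. \tag{53.7}$$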


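For the four-momentum $P^i$ of the field of a wave packet one obtains an expression of the form

$$P^i = Ak^i, \tag{53.8}$$

where the coefficient of proportionality $A$ between the two four-vectors $P^i$ and $k^i$ is some
scalar. In three-dimensional form this relation gives:

$$\mathbf{P} = A\mathbf{k}, \qquad \mathcal{E} = A\omega. \tag{53.9}$$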

For a wave of definite constant frequency $\omega$, the time dependence of the field is given by a
factor $e^{-i\omega t}$, and the eikonal can be written as

$$\psi = -\omega t + \psi_0(x, y, z), \tag{53.10}$$

where $\psi_0$ is a function only of the coordinates. The eikonal equation (53.5) now takes the
form

$$(\mathrm{grad}\,\psi_0)^2 = \frac{\omega^2}{c^2}. \tag{53.11}$$

The wave surfaces are the surfaces of constant eikonal, i.e. the family of surfaces of the form
$\psi_0(x, y, z) = \text{const}$. The rays themselves are at each point normal to the corresponding wave
surface; their direction is determined by the gradient $\nabla\psi_0$.
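
As an illustration of equation (53.11), the following Python sketch checks by finite differences that the eikonal of a spherical wave diverging from the origin, $\psi_0 = (\omega/c)r$, satisfies $(\mathrm{grad}\,\psi_0)^2 = \omega^2/c^2$. The choice of a spherical wave, the function names and the sample numbers are assumptions made for the example.

```python
import math

def psi0_spherical(x, y, z, omega, c):
    """Spatial eikonal of a spherical wave centred on the origin (assumed example)."""
    return (omega / c) * math.sqrt(x * x + y * y + z * z)

def grad_squared(f, point, h=1e-5):
    """Square of the gradient of f(x, y, z) at a point, by central differences."""
    total = 0.0
    for i in range(3):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        total += ((f(*plus) - f(*minus)) / (2 * h)) ** 2
    return total

omega, c = 2.0, 1.0                       # sample values, arbitrary units
eikonal = lambda x, y, z: psi0_spherical(x, y, z, omega, c)

# Both numbers should be close to omega^2 / c^2 = 4.
print(grad_squared(eikonal, (1.0, 2.0, 2.0)), (omega / c) ** 2)
```

The wave surfaces of this example are spheres, and the gradient points along the radius, so the rays are straight lines through the origin.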

As is well known, in the case where the energy is constant, the principle of least action for 
particles can also be written in the form of the so-called principle of Maupertuis: 

$$\delta S = \delta\int\mathbf{p}\cdot d\mathbf{l} = 0,$$

where the integration extends over the trajectory of the particle between two of its points. In 
this expression the momentum is assumed to be a function of the energy and the coordinates. 
The analogous principle for rays is called Fermat’s principle. In this case, we can write by 
analogy: 

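$$\delta\psi = \delta\int\mathbf{k}\cdot d\mathbf{l} = 0. \tag{53.12}$$

In vacuum, $\mathbf{k} = (\omega/c)\mathbf{n}$, and we obtain ($d\mathbf{l}\cdot\mathbf{n} = dl$):

$$\delta\int dl = 0, \tag{53.13}$$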

which corresponds to rectilinear propagation of the rays. 
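
The content of (53.13) can be illustrated with a deliberately simple Python sketch: among all two-segment paths from a point $A$ to a point $B$ that cross an intermediate line, the shortest is the one whose crossing point lies on the straight line $AB$. The points, the intermediate line $x = 1$ and the coarse grid scan are assumptions chosen for the example, not a general variational solver.

```python
import math

def path_length(a, m, b):
    """Length of the broken path a -> m -> b in the plane."""
    return math.dist(a, m) + math.dist(m, b)

def best_crossing(a, b, x_mid, y_values):
    """Crossing height y on the line x = x_mid that minimizes the path length."""
    return min(y_values, key=lambda y: path_length(a, (x_mid, y), b))

A, B = (0.0, 0.0), (2.0, 1.0)             # end points of the ray (sample values)
candidates = [i / 100 for i in range(-100, 201)]
best = best_crossing(A, B, 1.0, candidates)

# The straight line from A to B crosses x = 1 at y = 0.5.
print(best, path_length(A, (1.0, best), B), math.dist(A, B))
```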

§ 54. Intensity 

In geometrical optics, the light wave can be considered as a bundle of rays. The rays 
themselves, however, determine only the direction of propagation of the light at each point; 
there remains the question of the distribution of the light intensity in space. 

On some wave surface of the bundle of rays under consideration, we isolate an infinitesimal 
surface element. From differential geometry it is known that every surface has, at each of its 
points, two (generally different) principal radii of curvature. Let ac and bd (Fig. 7) be 
elements of the principal circles of curvature, constructed at a given element of the wave 
surface. Then the rays passing through $a$ and $c$ meet at the corresponding centre of curvature
$O_1$, while the rays passing through $b$ and $d$ meet at the other centre of curvature $O_2$.



Fig. 7. 
For fixed angular openings of the beams starting from $O_1$ and $O_2$, the lengths of the arcs
$ac$ and $bd$ are, clearly, proportional to the corresponding radii of curvature $R_1$ and $R_2$ (i.e. to
the lengths $O_1O$ and $O_2O$). The area of the surface element is proportional to the product of
the lengths $ac$ and $bd$, i.e., proportional to $R_1R_2$. In other words, if we consider the element
of the wave surface bounded by a definite set of rays, then as we move along them the area
of the element will change proportionally to $R_1R_2$.

On the other hand, the intensity, i.e. the energy flux density, is inversely proportional to
the surface area through which a given amount of light energy passes. Thus we arrive at the
result that the intensity is

$$I = \frac{\text{const}}{R_1R_2}. \tag{54.1}$$

This formula must be understood as follows. On each ray ($AB$ in Fig. 7) there are definite
points $O_1$ and $O_2$, which are the centres of curvature of all the wave surfaces intersecting the
given ray. The distances $OO_1$ and $OO_2$ from the point $O$ where the wave surface intersects
the ray, to the points $O_1$ and $O_2$, are the radii of curvature $R_1$ and $R_2$ of the wave surface at
the point $O$. Thus formula (54.1) determines the change in intensity of the light along a given
ray as a function of the distances from definite points on this ray. We emphasize that this
formula cannot be used to compare intensities at different points on a single wave surface.

Since the intensity is determined by the square modulus of the field, we can write for the
change of the field itself along the ray

$$f = \frac{\text{const}}{\sqrt{R_1R_2}}\,e^{ikR}, \tag{54.2}$$


where in the phase factor $e^{ikR}$ we can write either $e^{ikR_1}$ or $e^{ikR_2}$. The quantities $e^{ikR_1}$ and
$e^{ikR_2}$ (for a given ray) differ from each other only by a constant factor, since the difference
$R_1 - R_2$, the distance between the two centres of curvature, is a constant.

If the two radii of curvature of the wave surface coincide, then (54.1) and (54.2) have the
form:

$$I = \frac{\text{const}}{R^2}, \qquad f = \frac{\text{const}}{R}\,e^{ikR}. \tag{54.3}$$


This happens always when the light is emitted from a point source (the wave surfaces are 
then concentric spheres and R is the distance from the light source). 
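
Formulas (54.1) to (54.3) describe how the intensity and the field amplitude change along one ray. A minimal Python sketch, with hypothetical function names and sample distances, is:

```python
import math

def relative_intensity(R1, R2, R1_ref, R2_ref):
    """Ratio I / I_ref along a single ray, from I = const / (R1 R2)."""
    return (R1_ref * R2_ref) / (R1 * R2)

def relative_amplitude(R1, R2, R1_ref, R2_ref):
    """Ratio |f| / |f_ref| along a single ray, from |f| = const / sqrt(R1 R2)."""
    return math.sqrt(relative_intensity(R1, R2, R1_ref, R2_ref))

# Point source: R1 = R2 = R, so doubling the distance quarters the intensity.
print(relative_intensity(2.0, 2.0, 1.0, 1.0))      # 0.25
print(relative_amplitude(2.0, 2.0, 1.0, 1.0))      # 0.5

# Astigmatic bundle: centres of curvature a fixed distance d apart on the ray.
d = 0.5
print(relative_intensity(2.0, 2.0 + d, 1.0, 1.0 + d))
```

As emphasized above, such ratios are meaningful only for two points on the same ray, not for different points of one wave surface.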

From (54.1) we see that the intensity becomes infinite at the points $R_1 = 0$, $R_2 = 0$, i.e. at
the centres of curvature of the wave surface. Applying this to all the rays in a bundle, we find
that the intensity of the light in the given bundle becomes infinite, generally, on two surfaces:
the geometrical loci of all the centres of curvature of the wave surfaces. These surfaces are
called caustics. In the special case of a beam of rays with spherical wave surfaces the two
caustics fuse into a single point (focus).

We note from well-known results of differential geometry concerning the properties of the
loci of centres of curvature of a family of surfaces, that the rays are tangent to the caustic.

It is necessary to keep in mind that (for convex wave surfaces) the centres of curvature of
the wave surfaces can turn out to lie not on the rays themselves, but on their extensions
beyond the optical system from which they emerge. In such cases we speak of imaginary
caustics (or foci). In this case the intensity of the light does not become infinite anywhere.

As for the increase of intensity to infinity, in actuality we must understand that the
intensity does become large at points on the caustic, but it remains finite (see the problem
in § 59). The formal increase to infinity means that the approximation of geometrical optics
is never applicable in the neighbourhood of the caustic. To this is related the fact that the
change in phase along the ray can be determined from formula (54.2) only over sections of
the ray which do not include its point of tangency to the caustic. Later (in § 59), we shall
show that actually in passing through the caustic the phase of the field decreases by $\pi/2$. This
means that if, on the section of the ray before its first intersection with the caustic, the field
is proportional to the factor $e^{ikx}$ ($x$ is the coordinate along the ray), then after passage through
the caustic the field will be proportional to $e^{i(kx - \pi/2)}$. The same thing occurs in the neighbourhood
of the point of tangency to the second caustic, and beyond that point the field is proportional
to $e^{i(kx - \pi)}$.†

§ 55. The angular eikonal 

A light ray travelling in vacuum and impinging on a transparent body will, on its emergence
from this body, generally have a direction different from its initial direction. This change in 
direction will, of course, depend on the specific properties of the body and on its form. 
However, it turns out that one can derive general laws relating to the change in direction of 
a light ray on passage through an arbitrary material body. In this it is assumed only that 
geometrical optics is applicable to rays propagating in the interior of the body under 
consideration. As is customary, we shall call such transparent bodies, through which rays of 
light propagate, optical systems. 

Because of the analogy mentioned in § 53, between the propagation of rays and the motion 
of particles, the same general laws are valid for the change in direction of motion of a 
particle, initially moving in a straight line in vacuum, then passing through some electromagnetic 
field, and once more emerging into vacuum. For definiteness, we shall, however, always 
speak later of the propagation of light rays. 

We saw in a previous section that the eikonal equation, describing the propagation of the 
rays, can be written in the form (53.11) (for light of a definite frequency). From now on we 
shall, for convenience, designate by $\psi$ the eikonal $\psi_0$ divided by the constant $\omega/c$. Then the
basic equation of geometrical optics has the form:

$$(\nabla\psi)^2 = 1. \tag{55.1}$$

Each solution of this equation describes a definite beam of rays, in which the direction of
the rays passing through a given point in space is determined by the gradient of $\psi$ at that
point. However, for our purposes this description is insufficient, since we are seeking general
relations determining the passage through an optical system not of a single definite bundle
of rays, but of arbitrary rays. Therefore we must use an eikonal expressed in such a form that
it describes all the generally possible rays of light, i.e. rays passing through any pair of
points in space. In its usual form the eikonal $\psi(\mathbf{r})$ is the phase of the rays in a certain bundle
passing through the point $\mathbf{r}$. Now we must introduce the eikonal as a function $\psi(\mathbf{r}, \mathbf{r}')$ of the
coordinates of two points ($\mathbf{r}$, $\mathbf{r}'$ are the radius vectors of the initial and end points of the ray).
A ray can pass through each pair of points $\mathbf{r}$, $\mathbf{r}'$, and $\psi(\mathbf{r}, \mathbf{r}')$ is the phase difference (or, as
it is called, the optical path length) of this ray between the points $\mathbf{r}$ and $\mathbf{r}'$. From now on we
shall always understand by $\mathbf{r}$ and $\mathbf{r}'$ the radius vectors to points on the ray before and after
its passage through the optical system.

† Although formula (54.2) itself is not valid near the caustic, the change in phase of the field corresponds
formally to a change in sign (i.e. multiplication by $e^{i\pi}$) of $R_1$ or $R_2$ in this formula.
If in $\psi(\mathbf{r}, \mathbf{r}')$ one of the radius vectors, say $\mathbf{r}'$, is fixed, then $\psi$ as a function of $\mathbf{r}$ describes
a definite bundle of rays, namely, the bundle of rays passing through the point $\mathbf{r}'$. Then $\psi$
must satisfy equation (55.1), where the differentiations are applied to the components of $\mathbf{r}$.
Similarly, if $\mathbf{r}$ is assumed fixed, we again obtain an equation for $\psi(\mathbf{r}, \mathbf{r}')$, so that

$$(\nabla_\mathbf{r}\psi)^2 = 1, \qquad (\nabla_{\mathbf{r}'}\psi)^2 = 1. \tag{55.2}$$

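The direction of the ray is determined by the gradient of its phase. Since $\psi(\mathbf{r}, \mathbf{r}')$ is the
difference in phase at the points $\mathbf{r}$ and $\mathbf{r}'$, the direction of the ray at the point $\mathbf{r}'$ is given by
the vector $\mathbf{n}' = \partial\psi/\partial\mathbf{r}'$, and at the point $\mathbf{r}$ by the vector $\mathbf{n} = -\partial\psi/\partial\mathbf{r}$. From (55.2) it is clear
that $\mathbf{n}$ and $\mathbf{n}'$ are unit vectors:

$$\mathbf{n}^2 = \mathbf{n}'^2 = 1. \tag{55.3}$$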

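The four vectors $\mathbf{r}$, $\mathbf{r}'$, $\mathbf{n}$, $\mathbf{n}'$ are interrelated, since two of them ($\mathbf{n}$, $\mathbf{n}'$) are derivatives of a
certain function $\psi$ with respect to the other two ($\mathbf{r}$, $\mathbf{r}'$). The function $\psi$ itself satisfies the
auxiliary conditions (55.2).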

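To obtain the relation between $\mathbf{n}$, $\mathbf{n}'$, $\mathbf{r}$, $\mathbf{r}'$, it is convenient to introduce, in place of $\psi$,
another quantity, on which no auxiliary condition is imposed (i.e., which is not required to satisfy
any differential equations). This can be done as follows. In the function $\psi$ the independent
variables are $\mathbf{r}$ and $\mathbf{r}'$, so that for the differential $d\psi$ we have

$$d\psi = \frac{\partial\psi}{\partial\mathbf{r}}\cdot d\mathbf{r} + \frac{\partial\psi}{\partial\mathbf{r}'}\cdot d\mathbf{r}' = -\mathbf{n}\cdot d\mathbf{r} + \mathbf{n}'\cdot d\mathbf{r}'.$$

We now make a Legendre transformation from $\mathbf{r}$, $\mathbf{r}'$ to the new independent variables $\mathbf{n}$,
$\mathbf{n}'$, that is, we write

$$d\psi = -d(\mathbf{n}\cdot\mathbf{r}) + \mathbf{r}\cdot d\mathbf{n} + d(\mathbf{n}'\cdot\mathbf{r}') - \mathbf{r}'\cdot d\mathbf{n}',$$

from which, introducing the function

$$\chi = \mathbf{n}'\cdot\mathbf{r}' - \mathbf{n}\cdot\mathbf{r} - \psi, \tag{55.4}$$

we have

$$d\chi = -\mathbf{r}\cdot d\mathbf{n} + \mathbf{r}'\cdot d\mathbf{n}'. \tag{55.5}$$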

The function $\chi$ is called the angular eikonal; as we see from (55.5), the independent
variables in it are $\mathbf{n}$ and $\mathbf{n}'$. No auxiliary conditions are imposed on $\chi$. In fact, equation (55.3)
now states only a condition referring to the independent variables: of the three components
$n_x$, $n_y$, $n_z$ of the vector $\mathbf{n}$ (and similarly for $\mathbf{n}'$), only two are independent. As independent
variables we shall use $n_y$, $n_z$, $n'_y$, $n'_z$; then

$$n_x = \sqrt{1 - n_y^2 - n_z^2}, \qquad n'_x = \sqrt{1 - n'^2_y - n'^2_z}.$$

Substituting these expressions in

$$d\chi = -x\,dn_x - y\,dn_y - z\,dn_z + x'\,dn'_x + y'\,dn'_y + z'\,dn'_z,$$

we obtain for the differential $d\chi$:

$$d\chi = -\left(y - \frac{n_y}{n_x}x\right)dn_y - \left(z - \frac{n_z}{n_x}x\right)dn_z + \left(y' - \frac{n'_y}{n'_x}x'\right)dn'_y + \left(z' - \frac{n'_z}{n'_x}x'\right)dn'_z.$$



From this we obtain, finally, the following equations: 
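$$y - \frac{n_y}{n_x}x = -\frac{\partial\chi}{\partial n_y}, \qquad z - \frac{n_z}{n_x}x = -\frac{\partial\chi}{\partial n_z},$$

$$y' - \frac{n'_y}{n'_x}x' = \frac{\partial\chi}{\partial n'_y}, \qquad z' - \frac{n'_z}{n'_x}x' = \frac{\partial\chi}{\partial n'_z}, \tag{55.6}$$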


which is the relation sought between $\mathbf{n}$, $\mathbf{n}'$, $\mathbf{r}$, $\mathbf{r}'$. The function $\chi$ characterizes the special
properties of the body through which the rays pass (or the properties of the field, in the case
of the motion of a charged particle).

For fixed values of $\mathbf{n}$, $\mathbf{n}'$, each of the two pairs of equations (55.6) represents a straight line.
These lines are precisely the rays before and after passage through the optical system. Thus
the equations (55.6) directly determine the path of the ray on the two sides of the optical
system.


§ 56. Narrow bundles of rays 

In studying the passage of beams of rays through optical systems, special interest attaches 
to bundles whose rays all pass through one point (such bundles are said to be homocentric ). 

After passage through an optical system, homocentric bundles in general cease to be 
homocentric, i.e. after passing through a body the rays no longer come together in any one 
point. Only in exceptional cases will the rays starting from a luminous point come together 
after passage through an optical system and all meet at one point (the image of the luminous 
point).†

One can show (see § 57) that the only case for which all homocentric bundles remain 
strictly homocentric after passage through the optical system is the case of identical imaging, 
i.e. the case where the image differs from the object only in its position or orientation, or is 
mirror inverted. 

Thus no optical system can give a completely sharp image of an object (having finite 
dimensions) except in the trivial case of identical imaging.‡ Only approximate, but not
completely sharp images can be produced of an extended body, in any case other than for 
identical imaging. 

The most important case where there is approximate transition of homocentric bundles 
into homocentric bundles is that of sufficiently narrow beams (i.e. beams with a small 
opening angle) passing close to a particular line (for a given optical system). This line is 
called the optic axis of the system. 

Nevertheless, we must note that even infinitely narrow bundles of rays (in the three-dimensional
case) are in general not homocentric; we have seen (Fig. 7) that even in such
a bundle different rays intersect at different points (this phenomenon is called astigmatism). 
Exceptions are those points of the wave surface at which the two principal radii of curvature 
are equal—a small region of the surface in the neighbourhood of such points can be considered 
as spherical, and the corresponding narrow bundle of rays is homocentric. 


† The point of intersection can lie either on the rays themselves or on their continuations; depending on
this, the image is said to be real or virtual.
‡ Such imaging can be produced with a plane mirror.

We consider an optical system having axial symmetry.† The axis of symmetry of the
system is also its optical axis. The wave surface of a bundle of rays travelling along this axis
also has axial symmetry; as we know, surfaces of rotation have equal radii of curvature at
their points of intersection with the symmetry axis. Therefore a narrow bundle moving in
this direction remains homocentric.

To obtain general quantitative relations determining image formation with the aid of
narrow bundles passing through an axially-symmetric optical system, we use the general
equations (55.6) after determining first of all the form of the function $\chi$ in the case under
consideration.

Since the bundles of rays are narrow and move in the neighbourhood of the optical axis,
the vectors $\mathbf{n}$, $\mathbf{n}'$ for each bundle are directed almost along this axis. If we choose the optical
axis as the X axis, then the components $n_y$, $n_z$, $n'_y$, $n'_z$ will be small compared with unity. As
for the components $n_x$, $n'_x$: $n_x \approx 1$, while $n'_x$ can be approximately equal to either $+1$ or $-1$. In
the first case the rays continue to travel almost in their original direction, emerging into the
space on the other side of the optical system, which in this case is called a lens. In the second
the rays change their direction to almost the reverse; such an optical system is called a
mirror.

Making use of the smallness of $n_y$, $n_z$, $n'_y$, $n'_z$, we expand the angular eikonal
$\chi(n_y, n_z, n'_y, n'_z)$ in series and stop at the first terms. Because of the axial symmetry of the
whole system, $\chi$ must be invariant with respect to rotations of the coordinate system around
the optical axis. From this it is clear that in the expansion of $\chi$ there can be no terms of first
order, proportional to the first powers of the y- and z-components of the vectors $\mathbf{n}$ and $\mathbf{n}'$;
such terms would not have the required invariance. The terms of second order which have
the required property are the squares $\mathbf{n}^2$ and $\mathbf{n}'^2$ and the scalar product $\mathbf{n}\cdot\mathbf{n}'$. Thus, to terms
of second order, the angular eikonal of an axially-symmetric optical system has the form


$$\chi = \text{const} + \frac{g}{2}(n_y^2 + n_z^2) + f(n_y n'_y + n_z n'_z) + \frac{h}{2}(n'^2_y + n'^2_z), \tag{56.1}$$

where $f$, $g$, $h$ are constants.

For definiteness, we now consider a lens, so that we set $n'_x \approx 1$; for a mirror, as we shall
show later, all the formulas have a similar appearance. Now substituting the expression
(56.1) in the general equations (55.6), we obtain:

$$n_y(x - g) - f n'_y = y, \qquad f n_y + n'_y(x' + h) = y',$$

$$n_z(x - g) - f n'_z = z, \qquad f n_z + n'_z(x' + h) = z'. \tag{56.2}$$

We consider a homocentric bundle emanating from the point $x, y, z$; let the point $x', y', z'$
be the point in which all the rays of the bundle intersect after passing through the lens. If the
first and second pairs of equations (56.2) were independent, then these four equations, for
given $x, y, z, x', y', z'$, would determine one definite set of values $n_y$, $n_z$, $n'_y$, $n'_z$, that is, there
would be just one ray starting from the point $x, y, z$ which would pass through the point
$x', y', z'$. In order that all rays starting from $x, y, z$ shall pass through $x', y', z'$, it is consequently
necessary that the equations (56.2) not be independent, that is, one pair of these equations
must be a consequence of the other. The necessary condition for this dependence is that the
coefficients in the one pair of equations be proportional to the coefficients of the other pair.
Thus we must have

$$\frac{x - g}{f} = -\frac{f}{x' + h} = \frac{y}{y'} = \frac{z}{z'}. \tag{56.3}$$

In particular,

$$(x - g)(x' + h) = -f^2. \tag{56.4}$$

† It can be shown that the problem of image formation with the aid of narrow bundles, moving in the
neighbourhood of the optical axis in a nonaxially-symmetric system, can be reduced to image formation in
an axially-symmetric system plus a subsequent rotation of the image thus obtained, relative to the object.


The equations we have obtained give the required connection between the coordinates of 
the image and object for image formation using narrow bundles. 
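
As a numerical illustration of (56.2)-(56.4), the following Python sketch traces a few paraxial rays from one object point and compares their arrival points with the image predicted by (56.3) and (56.4). It is not part of the original derivation: the function names and the sample values of $f$, $g$, $h$ are arbitrary assumptions chosen only for the check.

```python
def image_point(x, y, z, f, g, h):
    """Image coordinates from (56.3)-(56.4); f, g, h are the constants of (56.1)."""
    x_img = -h - f**2 / (x - g)        # from (x - g)(x' + h) = -f^2
    m = -(x_img + h) / f               # y'/y = z'/z = -(x' + h)/f
    return x_img, m * y, m * z


def ray_hit(x, y, z, n_y, n_z, x_img, f, g, h):
    """Transverse position at which a ray leaving (x, y, z) with small direction
    components n_y, n_z crosses the plane x' = x_img, using (56.2)."""
    n_y_out = (n_y * (x - g) - y) / f  # left-hand equations of (56.2)
    n_z_out = (n_z * (x - g) - z) / f
    y_img = f * n_y + n_y_out * (x_img + h)  # right-hand equations of (56.2)
    z_img = f * n_z + n_z_out * (x_img + h)
    return y_img, z_img


if __name__ == "__main__":
    f, g, h = 2.0, -3.0, -1.0          # hypothetical eikonal constants
    x, y, z = -7.0, 0.1, -0.05         # hypothetical object point
    x_img, y_img, z_img = image_point(x, y, z, f, g, h)
    for n_y, n_z in [(0.0, 0.0), (0.01, -0.02), (-0.03, 0.015)]:
        print(ray_hit(x, y, z, n_y, n_z, x_img, f, g, h), (y_img, z_img))
```

Every ray direction gives the same arrival point, which is the statement that the two pairs of equations (56.2) are not independent when (56.3) holds.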

The points $x = g$ and $x' = -h$ on the optical axis are called the principal foci of the optical
system. Let us consider bundles of rays parallel to the optical axis. The source point of such
rays is, clearly, located at infinity on the optical axis, that is, $x = \infty$. From (56.3) we see that
in this case $x' = -h$. Thus a parallel bundle of rays, after passage through the optical system,
intersects at the principal focus. Conversely, a bundle of rays emerging from the principal
focus becomes parallel after passage through the system.

In the equation (56.3) the coordinates x and x' are measured from the same origin of 
coordinates, lying on the optical axis. It is, however, more convenient to measure the 
coordinates of object and image from different origins, choosing them at the corresponding 
principal foci. As positive direction of the coordinates we choose the direction from the 
corresponding focus toward the side to which the light travels. Designating the new
coordinates of object and image by capital letters, we have


$$X = x - g, \quad X' = x' + h, \quad Y = y, \quad Y' = y', \quad Z = z, \quad Z' = z'.$$


The equations of image formation (56.3) and (56.4) in the new coordinates take the form 


$$XX' = -f^2, \tag{56.5}$$

$$\frac{Y'}{Y} = \frac{Z'}{Z} = \frac{f}{X} = -\frac{X'}{f}. \tag{56.6}$$

The quantity $f$ is called the principal focal length of the system.

The ratio Y'/Y is called the lateral magnification. As for the longitudinal magnification, 
since the coordinates are not simply proportional to each other, it must be written in differential 
form, comparing the length of an element of the object (along the direction of the axis) with 
the length of the corresponding element in the image. From (56.5) we get for the “longitudinal 
magnification” 


$$\frac{dX'}{dX} = \frac{f^2}{X^2} = \left(\frac{Y'}{Y}\right)^2. \tag{56.7}$$


We see from this that even for an infinitely small object, it is impossible to obtain a 
geometrically similar image. The longitudinal magnification is never equal to the transverse 
(except in the trivial case of identical imaging). 

A bundle passing through the point $X = f$ on the optical axis intersects once more at the
point $X' = -f$ on the axis; these two points are called principal points. From equation (56.2)
($n_y X - f n'_y = Y$, $n_z X - f n'_z = Z$) it is clear that in this case ($X = f$, $Y = Z = 0$), we have the
equations $n_y = n'_y$, $n_z = n'_z$. Thus every ray starting from a principal point crosses the
optical axis again at the other principal point in a direction parallel to its original direction.

If the coordinates of object and image are measured from the principal points (and not
from the principal foci), then for these coordinates $\xi$ and $\xi'$ we have

$$\xi' = X' + f, \qquad \xi = X - f.$$

Substituting in (56.5) it is easy to obtain the equations of image formation in the form 


$$\frac{1}{\xi} - \frac{1}{\xi'} = -\frac{1}{f}. \tag{56.8}$$


One can show that for an optical system with small thickness (for example, a mirror or a
thin lens), the two principal points almost coincide. In this case the equation (56.8) is
particularly convenient, since in it $\xi$ and $\xi'$ are then measured practically from one and the
same point.
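
The relations (56.5)-(56.8) can be checked against one another with a short Python sketch. The function names below are illustrative assumptions, and the sample numbers in the usage lines are arbitrary; the formulas themselves are those of the text.

```python
def newton_image(X, f):
    """Image position and magnifications for an object at X (measured from the
    front principal focus), following (56.5)-(56.7)."""
    X_img = -f**2 / X            # (56.5): X X' = -f^2
    lateral = f / X              # (56.6): Y'/Y = f/X
    longitudinal = f**2 / X**2   # (56.7): dX'/dX = f^2/X^2
    return X_img, lateral, longitudinal


def principal_point_residual(X, f):
    """Left side minus right side of (56.8), with xi = X - f and xi' = X' + f."""
    X_img, _, _ = newton_image(X, f)
    xi, xi_img = X - f, X_img + f
    return (1.0 / xi - 1.0 / xi_img) + 1.0 / f


if __name__ == "__main__":
    print(newton_image(4.0, 2.0))
    print(principal_point_residual(4.0, 2.0))
```

The residual vanishes for any $X \neq f$, and the longitudinal magnification always equals the square of the lateral one, which is why the two can coincide only when both are equal to unity.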

If the focal distance is positive, then objects located in front of the focus ($X > 0$) are
imaged erect ($Y'/Y > 0$); such optical systems are said to be converging. If $f < 0$, then for
$X > 0$ we have $Y'/Y < 0$, that is, the object is imaged in inverted form; such systems are said
to be diverging.

There is one limiting case of image formation which is not contained in the formulas
(56.8); this is the case where all three coefficients $f$, $g$, $h$ are infinite (i.e. the optical system
has an infinite focal distance and its principal foci are located at infinity). Going to the limit
of infinite $f$, $g$, $h$ in (56.4) we obtain

$$x' = \frac{h}{g}x + \frac{f^2 - gh}{g}.$$


Since we are interested only in the case where the object and its image are located at finite
distances from the optical system, $f$, $g$, $h$ must approach infinity in such fashion that the
ratios $h/g$, $(f^2 - gh)/g$ are finite. Denoting them, respectively, by $\alpha^2$ and $\beta$, we have

$$x' = \alpha^2 x + \beta.$$

For the other two coordinates we now have from the general equation (56.7):

$$\frac{y'}{y} = \frac{z'}{z} = \pm\alpha.$$


Finally, again measuring the coordinates $x$ and $x'$ from different origins, namely from some
arbitrary point on the axis and from the image of this point, respectively, we obtain
the equations of image formation in the simple form

$$X' = \alpha^2 X, \qquad Y' = \pm\alpha Y, \qquad Z' = \pm\alpha Z. \tag{56.9}$$

Thus the longitudinal and transverse magnifications are constants (but not equal to each 
other). This case of image formation is called telescopic. 

All the equations (56.5) through (56.9), derived by us for lenses, apply equally to mirrors, 
and even to an optical system without axial symmetry, if only the image formation occurs 
by means of narrow bundles of rays travelling near the optical axis. In this, the reference 
points for the x coordinates of object and image must always be chosen along the optical 
axis from corresponding points (principal foci or principal points) in the direction of propagation 
of the ray. In doing this, we must keep in mind that for an optical system not possessing axial 
symmetry, the directions of the optical axis in front of and beyond the system do not lie in 
the same plane.


PROBLEMS

1. Find the focal distance for image formation with the aid of two axially-symmetric optical systems
whose optical axes coincide.

Solution: Let $f_1$ and $f_2$ be the focal lengths of the two systems. For each system separately, we have

$$X_1 X_1' = -f_1^2, \qquad X_2 X_2' = -f_2^2.$$

Since the image produced by the first system acts as the object for the second, then denoting by $l$ the
distance between the rear principal focus of the first system and the front focus of the second, we have
$X_2 = X_1' - l$; expressing $X_2'$ in terms of $X_1$, we obtain

$$X_2' = \frac{X_1 f_2^2}{f_1^2 + l X_1}$$

or

$$\left(X_1 + \frac{f_1^2}{l}\right)\left(X_2' - \frac{f_2^2}{l}\right) = -\left(\frac{f_1 f_2}{l}\right)^2,$$

from which it is clear that the principal foci of the composite system are located at the points
$X_1 = -f_1^2/l$, $X_2' = f_2^2/l$ and the focal length is

$$f = -\frac{f_1 f_2}{l}$$

(to choose the sign of this expression, we must write the corresponding equation for the transverse magnification).
Fig. 8.

In case $l = 0$, the focal length $f = \infty$, that is, the composite system gives telescopic image formation. In
this case we have $X_2' = X_1 (f_2/f_1)^2$, that is, the parameter $\alpha$ in the general formula (56.9) is $\alpha = f_2/f_1$.
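
The result of Problem 1 can be checked numerically by imaging a point through the two systems in turn. The Python sketch below is only such a check under the sign conventions of the solution; the function names and the sample focal lengths are assumptions made for illustration.

```python
def compose(X1, f1, f2, l):
    """Image coordinate X2' produced by two coaxial systems in succession.

    X1 is measured from the front focus of the first system, X2' from the rear
    focus of the second; l is the distance from the rear focus of the first
    system to the front focus of the second."""
    X1_img = -f1**2 / X1      # first system:  X1 X1' = -f1^2
    X2 = X1_img - l           # the image serves as the object of the second
    return -f2**2 / X2        # second system: X2 X2' = -f2^2


def composite_residual(X1, f1, f2, l):
    """(X1 + f1^2/l)(X2' - f2^2/l) + f^2 with f = -f1 f2 / l; should vanish."""
    X2_img = compose(X1, f1, f2, l)
    f = -f1 * f2 / l
    return (X1 + f1**2 / l) * (X2_img - f2**2 / l) + f**2


if __name__ == "__main__":
    print(composite_residual(1.5, 2.0, 3.0, 0.5))
    print(compose(1.5, 2.0, 3.0, 0.0), 1.5 * (3.0 / 2.0) ** 2)  # telescopic case
```

The second printed pair compares the telescopic case $l = 0$ with $X_2' = X_1 (f_2/f_1)^2$.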

2. Find the focal length for charged particles of a “magnetic lens” in the form of a longitudinal homogeneous
field in the section of length $l$ (Fig. 8).†

Solution: The kinetic energy of the particle is conserved during its motion in a magnetic field; therefore
the Hamilton-Jacobi equation for the reduced action $S_0(\mathbf{r})$ (where the total action is $S = -\mathcal{E}t + S_0$) is

$$\left(\nabla S_0 - \frac{e}{c}\mathbf{A}\right)^2 = p^2,$$

where $p^2 = \mathcal{E}^2/c^2 - m^2c^2 = \text{const}$.

Using formula (19.4) for the vector potential of the homogeneous magnetic field, choosing the x axis along
the field direction and considering this axis as the optical axis of an axially-symmetric optical system, we
get the Hamilton-Jacobi equation in the form:

$$\left(\frac{\partial S_0}{\partial x}\right)^2 + \left(\frac{\partial S_0}{\partial r}\right)^2 + \frac{e^2}{4c^2}H^2 r^2 = p^2, \tag{1}$$

where $r$ is the distance from the x axis, and $S_0$ is a function of $x$ and $r$.

† This might be the field inside a long solenoid, when we neglect the disturbance of the homogeneity of
the field near the ends of the solenoid.

For narrow beams of particles propagating close to the optical axis, the coordinate $r$ is small, so that
accordingly we try to find $S_0$ as a power series in $r$. The first two terms of this series are

$$S_0 = px + \frac{1}{2}\sigma(x) r^2, \tag{2}$$

where $\sigma(x)$ satisfies the equation

$$p\sigma'(x) + \sigma^2 + \frac{e^2}{4c^2}H^2 = 0. \tag{3}$$

In region 1 in front of the lens, where $H = 0$, we have:

$$\sigma^{(1)} = \frac{p}{x - x_1},$$

where $x_1 < 0$ is a constant. This solution corresponds to a free beam of particles, emerging along straight
line rays from the point $x = x_1$ on the optical axis in region 1. In fact, the action function for the free motion
of a particle with a momentum $p$ in a direction out from the point $x = x_1$ is

$$S_0 = p\sqrt{r^2 + (x - x_1)^2} \approx p(x - x_1) + \frac{p r^2}{2(x - x_1)}.$$


Similarly, in region 2 behind the lens we write:

$$\sigma^{(2)} = \frac{p}{x - x_2},$$

where the constant $x_2$ is the coordinate of the image of the point $x_1$.

In region 3 inside the lens, the solution of equation (3) is obtained by separation of variables, and gives:

$$\sigma^{(3)} = \frac{eH}{2c}\cot\left(\frac{eH}{2cp}x + C\right),$$

where $C$ is an arbitrary constant.

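The constants $C$ and $x_2$ (for given $x_1$) are determined by the requirements of continuity of $\sigma(x)$ for $x = 0$
and $x = l$:

$$-\frac{p}{x_1} = \frac{eH}{2c}\cot C, \qquad \frac{p}{l - x_2} = \frac{eH}{2c}\cot\left(\frac{eH}{2cp}l + C\right).$$

Eliminating the constant $C$ from these equations, we find:

$$(x_1 - g)(x_2 + h) = -f^2,$$

where†

$$g = -\frac{2cp}{eH}\cot\frac{eHl}{2cp}, \qquad h = g - l, \qquad f = \frac{2cp}{eH\sin\dfrac{eHl}{2cp}}.$$

† The value of $f$ is given with the correct sign. However, to show this requires additional investigation.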
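
The elimination of $C$ in Problem 2 can be verified numerically. In the Python sketch below the single length $q = 2cp/(eH)$ stands in for the physical constants; this parametrization, the function names and the sample values are assumptions made for the check, not part of the original solution.

```python
import math


def lens_constants(q, l):
    """g, h, f of the magnetic lens, with q = 2cp/(eH) and l the field length."""
    a = l / q                     # the angle eHl/(2cp)
    g = -q / math.tan(a)          # g = -q cot(a)
    h = g - l
    f = q / math.sin(a)
    return g, h, f


def image_of(x1, q, l):
    """Image coordinate x2 obtained from the two continuity conditions on sigma(x)."""
    C = math.atan2(1.0, -q / x1)            # continuity at x = 0: cot C = -q/x1
    return l - q * math.tan(l / q + C)      # continuity at x = l


if __name__ == "__main__":
    q, l, x1 = 1.0, 0.5, -3.0               # hypothetical values in arbitrary units
    g, h, f = lens_constants(q, l)
    x2 = image_of(x1, q, l)
    print((x1 - g) * (x2 + h), -f**2)
```

The two printed numbers agree, as required by $(x_1 - g)(x_2 + h) = -f^2$. Note that this check confirms only $f^2$; it says nothing about the sign of $f$.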

§ 57. Image formation with broad bundles of rays

The formation of images with the aid of narrow bundles of rays, which was considered in
the previous section, is approximate; it is the more exact (i.e. the sharper) the narrower the
bundles. We now go over to the question of image formation with bundles of rays of
arbitrary breadth.

In contrast to the formation of an image of an object by narrow beams, which can be 
achieved for any optical system having axial symmetry, image formation with broad beams 
is possible only for specially constituted optical systems. Even with this limitation, as 
already pointed out in § 56, image formation is not possible for all points in space. 

The later derivations are based on the following essential remark. Suppose that all rays, 
starting from a certain point O and travelling through the optical system, intersect again at 
some other point O'. It is easy to see that the optical path length $\psi$ is the same for all these
rays. In the neighbourhood of each of the points O, O', the wave surfaces for the rays 
intersecting in them are spheres with centres at O and O', respectively, and, in the limit as 
we approach O and O', degenerate to these points. But the wave surfaces are the surfaces of 
constant phase, and therefore the change in phase along different rays, between their points 
of intersection with two given wave surfaces, is the same. From what has been said, it 
follows that the total change in phase between the points O and O' is the same (for the 
different rays). 

Let us consider the conditions which must be fulfilled in order to have formation of an
image of a small line segment using broad beams; the image is then also a small line
segment. We choose the directions of these segments as the directions of the $\xi$ and $\xi'$ axes,
with origins at any two corresponding points O and O' of the object and image. Let $\psi$ be the
optical path length for the rays starting from O and reaching O'. For the rays starting from
a point infinitely near to O with coordinate $d\xi$, and arriving at a point of the image with
coordinate $d\xi'$, the optical path length is $\psi + d\psi$, where

$$d\psi = \frac{\partial\psi}{\partial\xi}d\xi + \frac{\partial\psi}{\partial\xi'}d\xi'.$$

We introduce the “magnification”

$$\alpha_\xi = \frac{d\xi'}{d\xi}$$

as the ratio of the length $d\xi'$ of the element of the image to the length $d\xi$ of the imaged
element. Because of the smallness of the line segment which is being imaged, the quantity
$\alpha_\xi$ can be considered constant along the line segment. Writing, as usual, $\partial\psi/\partial\xi = -n_\xi$,
$\partial\psi/\partial\xi' = n'_\xi$ ($n_\xi$, $n'_\xi$ are the cosines of the angles between the directions of the ray and the
corresponding axes $\xi$ and $\xi'$), we obtain

$$d\psi = (\alpha_\xi n'_\xi - n_\xi)\,d\xi.$$

As for every pair of corresponding points of object and image, the optical path length $\psi + d\psi$
must be the same for all rays starting from the point $d\xi$ and arriving at the point $d\xi'$.
From this we obtain the condition:

$$\alpha_\xi n'_\xi - n_\xi = \text{const}. \tag{57.1}$$

This is the condition we have been seeking, which the paths of the rays in the optical system
must satisfy in order to have image formation for a small line segment using broad beams.
The relation (57.1) must be fulfilled for all rays starting from the point O.

Let us apply this condition to image formation by means of an axially-symmetric optical
system. We start with the image of a line segment coinciding with the optical axis (x axis);
clearly the image also coincides with the axis. A ray moving along the optical axis ($n_x = 1$),
because of the axial symmetry of the system, does not change its direction after passing
through it, that is, $n'_x$ is also 1. From this it follows that const in (57.1) is equal in this case
to $\alpha_x - 1$, and we can rewrite (57.1) in the form

$$\frac{1 - n_x}{1 - n'_x} = \alpha_x.$$

Denoting by $\theta$ and $\theta'$ the angles subtended by the rays with the optical axis at points of the
object and image, we have

$$1 - n_x = 1 - \cos\theta = 2\sin^2\frac{\theta}{2}, \qquad 1 - n'_x = 1 - \cos\theta' = 2\sin^2\frac{\theta'}{2}.$$

Thus we obtain the condition for image formation in the form

$$\frac{\sin(\theta/2)}{\sin(\theta'/2)} = \text{const} = \sqrt{\alpha_x}. \tag{57.2}$$

Next, let us consider the imaging of a small portion of a plane perpendicular to the optical
axis of an axially symmetric system; the image will obviously also be perpendicular to this
axis. Applying (57.1) to an arbitrary segment lying in the plane which is to be imaged, we
get:

$$\alpha_r \sin\theta' - \sin\theta = \text{const},$$

where $\theta$ and $\theta'$ are again the angles made by the beam with the optical axis. For rays
emerging from the point of intersection of the object plane with the optical axis, and directed
along this axis ($\theta = 0$), we must have $\theta' = 0$, because of symmetry. Therefore const is zero,
and we obtain the condition for imaging in the form

$$\frac{\sin\theta}{\sin\theta'} = \text{const} = \alpha_r. \tag{57.3}$$


As for the formation of an image of a three-dimensional object using broad beams, it is 
easy to see that this is impossible even for a small volume, since the conditions (57.2) and 
(57.3) are incompatible. 
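
The incompatibility of (57.2) and (57.3) can be seen numerically. The Python sketch below takes rays that satisfy the sine condition (57.3) for a chosen $\alpha_r$ and evaluates the ratio that (57.2) requires to be constant. The function name and the sample angles are illustrative assumptions.

```python
import math


def half_angle_ratio(theta, alpha_r):
    """sin(theta/2)/sin(theta'/2) for a ray whose image-side angle theta'
    is fixed by the sine condition (57.3): sin(theta) = alpha_r * sin(theta')."""
    theta_img = math.asin(math.sin(theta) / alpha_r)
    return math.sin(theta / 2) / math.sin(theta_img / 2)


if __name__ == "__main__":
    for alpha_r in (1.0, 2.0):
        print(alpha_r, [half_angle_ratio(t, alpha_r) for t in (0.1, 0.5, 1.0)])
```

For $\alpha_r = 2$ the ratio drifts with $\theta$, so (57.2) fails for broad beams; only for unit magnification, where $\theta' = \theta$, do the two conditions hold together.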


§ 58. The limits of geometrical optics 

From the definition of a monochromatic plane wave, its amplitude is the same everywhere
and at all times. Such a wave is infinite in extent in all directions in space, and exists over
the whole range of time from $-\infty$ to $+\infty$. Any wave whose amplitude is not constant everywhere
at all times can only be more or less monochromatic. We now take up the question of the
“degree of non-monochromaticity” of a wave.

Let us consider an electromagnetic wave whose amplitude at each point is a function of
the time. Let $\omega_0$ be some average frequency of the wave. Then the field of the wave, for
example the electric field, at a given point has the form $\mathbf{E}_0(t)e^{-i\omega_0 t}$. This field, although it
is of course not monochromatic, can be expanded in monochromatic waves, that is, in a
Fourier integral. The amplitude of the component in the expansion, with frequency $\omega$, is
proportional to the integral

$$\int_{-\infty}^{+\infty} \mathbf{E}_0(t)\,e^{i(\omega - \omega_0)t}\,dt.$$

The factor $e^{i(\omega - \omega_0)t}$ is a periodic function whose average value is zero. If $\mathbf{E}_0$ were exactly
constant, then the integral would be exactly zero for $\omega \neq \omega_0$. If, however, $\mathbf{E}_0(t)$ is variable,
but hardly changes over a time interval of order $1/|\omega - \omega_0|$, then the integral is almost equal
to zero, the more exactly the slower the variation of $\mathbf{E}_0$. In order for the integral to be
significantly different from zero, it is necessary that $\mathbf{E}_0(t)$ vary significantly over a time
interval of the order of $1/|\omega - \omega_0|$.

We denote by $\Delta t$ the order of magnitude of the time interval during which the amplitude
of the wave at a given point in space changes significantly. From these considerations, it
now follows that the frequencies deviating most from $\omega_0$, which appear with reasonable
intensity in the spectral resolution of this wave, are determined by the condition
$1/|\omega - \omega_0| \sim \Delta t$. If we denote by $\Delta\omega$ the frequency interval (around the average frequency $\omega_0$) which
enters in the spectral resolution of the wave, then we have the relation

$$\Delta\omega\,\Delta t \sim 1. \tag{58.1}$$

We see that a wave is the more monochromatic (i.e. the smaller $\Delta\omega$) the larger $\Delta t$, i.e. the
slower the variation of the amplitude at a given point in space.
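
The order-of-magnitude relation (58.1) can be illustrated for one particular envelope. The Python sketch below assumes a Gaussian amplitude $E_0(t) = e^{-t^2/2\tau^2}$ and defines $\Delta t$ and $\Delta\omega$ as root-mean-square widths of the intensity in time and in frequency; both the envelope and this definition of the widths are assumptions made only for the example, for which the product is $1/2$ whatever the value of $\tau$.

```python
import math


def spectral_width(tau, n_t=801, n_w=201):
    """RMS widths (delta_t, delta_w) of the intensity of a Gaussian envelope
    exp(-t^2 / (2 tau^2)) and of its Fourier spectrum, by direct summation."""
    t_max = 8.0 * tau
    step_t = 2.0 * t_max / (n_t - 1)
    ts = [-t_max + i * step_t for i in range(n_t)]
    env = [math.exp(-t * t / (2.0 * tau * tau)) for t in ts]

    w_max = 8.0 / tau
    step_w = 2.0 * w_max / (n_w - 1)
    ws = [-w_max + j * step_w for j in range(n_w)]   # w stands for omega - omega_0

    spectrum = []
    for w in ws:
        re = sum(e * math.cos(w * t) for e, t in zip(env, ts)) * step_t
        im = sum(e * math.sin(w * t) for e, t in zip(env, ts)) * step_t
        spectrum.append(re * re + im * im)           # intensity of the component

    def rms(xs, weights):
        return math.sqrt(sum(x * x * g for x, g in zip(xs, weights)) / sum(weights))

    delta_t = rms(ts, [e * e for e in env])
    delta_w = rms(ws, spectrum)
    return delta_t, delta_w


if __name__ == "__main__":
    for tau in (1.0, 3.0):
        d_t, d_w = spectral_width(tau)
        print(tau, d_t * d_w)
```

Lengthening the envelope by a factor of three narrows the spectrum by the same factor, leaving the product unchanged.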

Relations similar to (58.1) are easily derived for the wave vector. Let $\Delta x$, $\Delta y$, $\Delta z$ be the
orders of magnitude of distances along the X, Y, Z axes, in which the wave amplitude
changes significantly. At a given time, the field of the wave as a function of the coordinates
has the form

$$\mathbf{E}_0(\mathbf{r})\,e^{i\mathbf{k}_0\cdot\mathbf{r}},$$

where $\mathbf{k}_0$ is some average value of the wave vector. By a completely analogous derivation
to that for (58.1) we can obtain the interval $\Delta\mathbf{k}$ of values contained in the expansion of the
wave into a Fourier integral:

$$\Delta k_x\,\Delta x \sim 1, \qquad \Delta k_y\,\Delta y \sim 1, \qquad \Delta k_z\,\Delta z \sim 1. \tag{58.2}$$

Let us consider, in particular, a wave which is radiated during a finite time interval. We
denote by $\Delta t$ the order of magnitude of this interval. The amplitude at a given point in space
changes significantly during the time $\Delta t$ in the course of which the wave travels completely
past the point. Because of the relations (58.1) we can now say that the “lack of
monochromaticity” of such a wave, $\Delta\omega$, cannot be smaller than $1/\Delta t$ (it can of course be
larger):

$$\Delta\omega \gtrsim \frac{1}{\Delta t}. \tag{58.3}$$

Similarly, if $\Delta x$, $\Delta y$, $\Delta z$ are the orders of magnitude of the extension of the wave in space,
then for the spread in the values of components of the wave vector, entering in the resolution
of the wave, we obtain

$$\Delta k_x \gtrsim \frac{1}{\Delta x}, \qquad \Delta k_y \gtrsim \frac{1}{\Delta y}, \qquad \Delta k_z \gtrsim \frac{1}{\Delta z}. \tag{58.4}$$

From these formulas it follows that if we have a beam of light of finite width, then the
direction of propagation of the light in such a beam cannot be strictly constant. Taking the
X axis along the (average) direction of light in the beam, we obtain

$$\theta_y \gtrsim \frac{1}{k\,\Delta y} \sim \frac{\lambda}{\Delta y}, \tag{58.5}$$

where $\theta_y$ is the order of magnitude of the deviation of the beam from its average direction
in the XY plane and $\lambda$ is the wavelength.

On the other hand, the formula (58.5) answers the question of the limit of sharpness of
optical image formation. A beam of light whose rays, according to geometrical optics, would
all intersect in a point, actually gives an image not in the form of a point but in the form of
a spot. For the width $\Delta$ of this spot, we obtain, according to (58.5),

$$\Delta \sim \frac{1}{k\theta} \sim \frac{\lambda}{\theta}, \tag{58.6}$$

where $\theta$ is the opening angle of the beam. This formula can be applied not only to the image
but also to the object. Namely, we can state that in observing a beam of light emerging from
a luminous point, this point cannot be distinguished from a body of dimensions $\lambda/\theta$. In this
way formula (58.6) determines the limiting resolving power of a microscope. The minimum
value of $\Delta$, which is reached for $\theta \sim 1$, is $\lambda$, in complete agreement with the fact that the limit
of geometrical optics is determined by the wavelength of the light.


PROBLEM 

Determine the order of magnitude of the smallest width of a light beam produced from a parallel beam
at a distance $l$ from a diaphragm.

Solution: Denoting the size of the aperture in the diaphragm by $d$, we have from (58.5) for the angle of
deflection of the beam (the “diffraction angle”), $\lambda/d$, so that the width of the beam is of order $d + (\lambda/d)l$. The
smallest value of this quantity is $\sim \sqrt{\lambda l}$.
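
The minimization in this solution is a one-line calculation, sketched here in Python with illustrative names and sample values that are not taken from the text. Setting the derivative of $d + \lambda l/d$ with respect to $d$ to zero gives $d = \sqrt{\lambda l}$, at which the estimated width is $2\sqrt{\lambda l}$, i.e. of order $\sqrt{\lambda l}$.

```python
import math


def beam_width(d, wavelength, l):
    """Order-of-magnitude width d + (lambda/d) l at distance l behind an aperture of size d."""
    return d + wavelength * l / d


def best_aperture(wavelength, l):
    """Aperture size that minimizes beam_width: d = sqrt(lambda * l)."""
    return math.sqrt(wavelength * l)


if __name__ == "__main__":
    wavelength, l = 5e-7, 1.0          # hypothetical values, in metres
    d0 = best_aperture(wavelength, l)
    print(d0, beam_width(d0, wavelength, l))
```

Being an order-of-magnitude estimate, the result should not be read as fixing the numerical factor.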

§ 59. Diffraction 

The laws of geometrical optics are strictly correct only in the ideal case when the wavelength 
can be considered to be infinitely small. The more poorly this condition is fulfilled, the 
greater are the deviations from geometrical optics. Phenomena which are the consequence
of such deviations are called diffraction phenomena.

Diffraction phenomena can be observed, for example, if along the path of propagation of
the light† there is an obstacle, an opaque body (we call it a screen) of arbitrary form, or, for
example, if the light passes through holes in opaque screens. If the laws of geometrical
optics were strictly satisfied, there would be beyond the screen regions of “shadow” sharply
delineated from regions where light falls. The diffraction has the consequence that, instead
of a sharp boundary between light and shadow, there is a quite complex distribution of the
intensity of the light. These diffraction phenomena appear the more strongly the smaller the
dimensions of the screens and the apertures in them, or the greater the wavelength.

† In what follows, in discussing diffraction we shall talk of the diffraction of light; all these same
considerations also apply, of course, to any electromagnetic wave.

The problem of the theory of diffraction consists in determining, for given positions and
shapes of the objects (and locations of the light sources), the distribution of the light, that is, 
the electromagnetic field over all space. The exact solution of this problem is possible only 
through solution of the wave equation with suitable boundary conditions at the surface of 
the body, these conditions being determined also by the optical properties of the material. 
Such a solution usually presents great mathematical difficulties. 

However, there is an approximate method which for many cases is a satisfactory solution 
of the problem of the distribution of light near the boundary between light and shadow. This 
method is applicable to cases of small deviation from geometrical optics, i.e. when firstly, 
the dimensions of all bodies are large compared with the wavelength (this requirement 
applies both to the dimensions of screens and apertures and also to the distances from the 
bodies to the points of emission and observation of the light); and secondly when there are 
only small deviations of the light from the directions of the rays given by geometrical optics. 

Let us consider a screen with an aperture through which the light passes from given 
sources. Figure 9 shows the screen in profile (the heavy line); the light travels from left to 
right. We denote by u some one of the components of E or H. Here we shall understand u 
to mean a function only of the coordinates, i.e. without the factor $e^{-i\omega t}$ determining the time
dependence. Our problem is to determine the light intensity, that is, the field $u$, at any point
of observation P beyond the screen. For an approximate solution of this problem in cases
where the deviations from geometrical optics are small, we may assume that at the points of 
the aperture the field is the same as it would have been in the absence of the screen. In other 
words, the values of the field here are those which follow directly from geometrical optics. 
At all points immediately behind the screen, the field can be set equal to zero. In this the 
properties of the screen (i.e. of the screen material) obviously play no part. It is also obvious 
that in the cases we are considering, what is important for the diffraction is only the shape 
of the edge of the aperture, while the shape of the opaque screen is unimportant. 



Fig. 9. 

We introduce some surface which covers the aperture in the screen and is bounded by its 
edges (a profile of such a surface is shown in Fig. 9 as a dashed line). We break up this 
surface into sections with area df, whose dimensions are small compared with the size of the 
aperture, but large compared with the wavelength of the light. We can then consider each of 
these sections through which the light passes as if it were itself a source of light waves 
spreading out on all sides from this section. We shall consider the field at the point P to be 
the result of superposition of the fields produced by all the sections df of the surface 
covering the aperture. (This is called Huygens' principle.)

The field produced at the point P by the section $df$ is obviously proportional to the value
$u$ of the field at the section $df$ itself (we recall that the field at $df$ is assumed to be the same
as it would have been in the absence of the screen). In addition, it is proportional to the
projection $df_n$ of the area $df$ on the plane perpendicular to the direction $\mathbf{n}$ of the ray coming
from the light source to $df$. This follows from the fact that no matter what shape the element
$df$ has, the same rays will pass through it provided its projection $df_n$ remains fixed, and
therefore its effect on the field at P will be the same.

Thus the field produced at the point P by the section $df$ is proportional to $u\,df_n$. Furthermore,
we must still take into account the change in the amplitude and phase of the wave during its
propagation from $df$ to P. The law of this change is determined by formula (54.3). Therefore
$u\,df_n$ must be multiplied by $(1/R)e^{ikR}$ (where $R$ is the distance from $df$ to P, and $k$ is the
absolute value of the wave vector of the light), and we find that the required field is

$$a u \frac{e^{ikR}}{R}\,df_n,$$

where $a$ is an as yet unknown constant. The field at the point P, being the result of the
addition of the fields produced by all the elements $df$, is consequently equal to

$$u_P = a\int \frac{u\,e^{ikR}}{R}\,df_n, \tag{59.1}$$

where the integral extends over the surface bounded by the edge of the aperture. In the 
approximation we are considering, this integral cannot, of course, depend on the form of this 
surface. Formula (59.1) is, obviously, applicable not only to diffraction by an aperture in a 
screen, but also to diffraction by a screen around which the light passes freely. In that case 
the surface of integration in (59.1) extends on all sides from the edge of the screen. 

To determine the constant a, we consider a plane wave propagating along the X axis; the
wave surfaces are parallel to the plane YZ. Let u be the value of the field in the YZ plane.
Then at the point P, which we choose on the X axis, the field is equal to $u_P = u e^{ikx}$. On the
other hand, the field at the point P can be determined starting from formula (59.1), choosing
as surface of integration, for example, the YZ plane. In doing this, because of the smallness
of the angle of diffraction, only those points of the YZ plane are important in the integral
which lie close to the origin, i.e. the points for which $y, z \ll x$ (x is the coordinate of the
point P). Then

$$R = \sqrt{x^2 + y^2 + z^2} \approx x + \frac{y^2 + z^2}{2x},$$

and (59.1) gives

$$u_P = a u \frac{e^{ikx}}{x} \int_{-\infty}^{+\infty} e^{iky^2/2x}\,dy \int_{-\infty}^{+\infty} e^{ikz^2/2x}\,dz,$$

where u is a constant (the field in the YZ plane); in the factor 1/R, we can put $R \approx x$ = const.
By the substitution $y = \xi\sqrt{2x/k}$ these two integrals can be transformed to the integral

$$\int_{-\infty}^{+\infty} e^{i\xi^2}\,d\xi = \int_{-\infty}^{+\infty} \cos \xi^2\,d\xi + i \int_{-\infty}^{+\infty} \sin \xi^2\,d\xi = \sqrt{\frac{\pi}{2}}\,(1 + i),$$

and we get

$$u_P = a u e^{ikx}\,\frac{2i\pi}{k}.$$

On the other hand, $u_P = u e^{ikx}$, and consequently

$$a = \frac{k}{2\pi i}.$$

Substituting in (59.1), we obtain the solution to our problem in the form

$$u_P = \int \frac{k u}{2\pi i R}\, e^{ikR}\,df_n. \tag{59.2}$$

In deriving formula (59.2), the light source was assumed to be essentially a point, and the 
light was assumed to be strictly monochromatic. The case of a real, extended source, which 
emits non-monochromatic light, does not, however, require special treatment. Because of 
the complete independence (incoherence) of the light emitted by different points of the 
source, and the incoherence of the different spectral components of the emitted light, the 
total diffraction pattern is simply the sum of the intensity distributions obtained from the 
diffraction of the independent components of the light. 

Let us apply formula (59.2) to the solution of the problem of the change in phase of a ray
on passing through its point of tangency to the caustic (see the end of § 54). We choose as
our surface of integration in (59.2) any wave surface, and determine the field $u_P$ at a point
P, lying on some given ray at a distance x from its point of intersection with the wave surface
we have chosen (we choose this point as coordinate origin O, and as YZ plane the plane
tangent to the wave surface at the point O). In the integration of (59.2) only a small area of
the wave surface in the neighbourhood of O is important. If the XY and XZ planes are chosen
to coincide with the principal planes of curvature of the wave surface at the point O, then
near this point the equation of the surface is

$$X = \frac{y^2}{2R_1} + \frac{z^2}{2R_2},$$

where $R_1$ and $R_2$ are the radii of curvature. The distance R from the point on the wave surface
with coordinates X, y, z, to the point P with coordinates x, 0, 0, is

$$R = \sqrt{(x - X)^2 + y^2 + z^2} \approx x + \frac{y^2}{2}\left(\frac{1}{x} - \frac{1}{R_1}\right) + \frac{z^2}{2}\left(\frac{1}{x} - \frac{1}{R_2}\right).$$

On the wave surface, the field u can be considered constant; the same applies to the factor
1/R. Since we are interested only in changes in the phase of the wave, we drop coefficients
and write simply

$$u_P \sim \frac{1}{i}\int e^{ikR}\,df_n = \frac{e^{ikx}}{i} \int_{-\infty}^{+\infty} e^{ik\frac{y^2}{2}\left(\frac{1}{x} - \frac{1}{R_1}\right)}\,dy \int_{-\infty}^{+\infty} e^{ik\frac{z^2}{2}\left(\frac{1}{x} - \frac{1}{R_2}\right)}\,dz. \tag{59.3}$$

The centres of curvature of the wave surface lie on the ray we are considering, at the
points $x = R_1$ and $x = R_2$; these are the points where the ray is tangent to the caustic. Suppose
$R_2 < R_1$. For $x < R_2$, the coefficients of i in the exponentials appearing in the two integrands
are positive, and each of these integrals is proportional to (1 + i). Therefore on the part of
the ray before its first tangency to the caustic, we have $u_P \sim e^{ikx}$. For $R_2 < x < R_1$, that is, on
the segment of the ray between its two points of tangency, the integral over y is proportional
to 1 + i, but the integral over z is proportional to 1 - i, so that their product does not contain
i. Thus we have here $u_P \sim -ie^{ikx} = e^{i(kx - \pi/2)}$, that is, as the ray passes in the neighbourhood
of the first caustic, its phase undergoes an additional change of $-\pi/2$. Finally, for $x > R_1$, we
have $u_P \sim -e^{ikx} = e^{i(kx - \pi)}$, that is, on passing in the neighbourhood of the second caustic, the
phase once more changes by $-\pi/2$.
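
The three phase regimes can be checked with a few lines of code. The following is only an illustrative sketch: the function names are our own, and it relies on the standard closed form of the integral of $e^{i\alpha y^2}$ over all y, namely $\sqrt{\pi/|\alpha|}\,e^{\pm i\pi/4}$ with the sign of $\alpha$. It returns the phase of $u_P e^{-ikx}$ from (59.3) for given radii of curvature, here called r1 and r2.

```python
import cmath
import math


def fresnel_gauss(alpha):
    """Closed form of the integral of exp(i*alpha*y**2) over all y (alpha != 0)."""
    phase = math.copysign(math.pi / 4.0, alpha)
    return math.sqrt(math.pi / abs(alpha)) * cmath.exp(1j * phase)


def extra_phase(x, r1, r2, k=1.0):
    """Phase of u_P * exp(-i*k*x) according to (59.3), up to a positive factor."""
    integral_y = fresnel_gauss(0.5 * k * (1.0 / x - 1.0 / r1))
    integral_z = fresnel_gauss(0.5 * k * (1.0 / x - 1.0 / r2))
    return cmath.phase(integral_y * integral_z / 1j)
```

With r2 < r1, a point with x < r2 gives a phase close to zero, a point between the two centres of curvature gives a phase close to $-\pi/2$, and a point with x > r1 gives a phase equal to $\pi$ in absolute value (the sign returned for this last case depends on rounding, since $+\pi$ and $-\pi$ describe the same phase).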


PROBLEM 

Determine the distribution of the light intensity in the neighbourhood of the point where the ray is tangent 
to the caustic. 

Solution: To solve the problem, we use formula (59.2), taking the integral in it over any wave surface 
which is sufficiently far from the point of tangency of the ray to the caustic. In Fig. 10, ab is a section of 
this wave surface, and a'b' is a section of the caustic; a'b' is the evolute of the curve ab. We are interested 
in the intensity distribution in the neighbourhood of the point O where the ray QO is tangent to the caustic; 
we assume the length D of the segment QO of the ray to be large. We denote by x the distance from the point
O along the normal to the caustic, and assume positive values of x for points on the normal in the direction of
the centre of curvature.



The integrand in (59.2) is a function of the distance R from the arbitrary point Q' on the wave surface to
the point P. From a well-known property of the evolute, the sum of the length of the segment Q'O' of the
tangent at the point O' and the length of the arc OO' is equal to the length QO of the tangent at the point
O. For points O and O' which are near to each other we have $OO' = \theta\rho$ ($\rho$ is the radius of curvature of the
caustic at the point O). Therefore the length $Q'O' = D - \theta\rho$. The distance Q'O (along a straight line) is
approximately (the angle $\theta$ is assumed to be small)

$$Q'O \approx Q'O' + \rho\sin\theta = D - \theta\rho + \rho\sin\theta \approx D - \frac{1}{6}\rho\theta^3.$$

Finally, the distance R = Q'P is equal to $R \approx Q'O - x\sin\theta \approx Q'O - x\theta$, that is,

$$R \approx D - x\theta - \frac{1}{6}\rho\theta^3.$$

Substituting this expression in (59.2), we obtain

$$u_P \sim \int_{-\infty}^{+\infty} e^{-ikx\theta - \frac{ik\rho}{6}\theta^3}\,d\theta = 2\int_0^{\infty} \cos\left(kx\theta + \frac{k\rho}{6}\theta^3\right) d\theta$$

(the slowly varying factor 1/D in the integrand is unimportant compared with the exponential factor, so we
assume it constant). Introducing the new integration variable $\xi = (k\rho/2)^{1/3}\theta$, we get

$$u_P \sim \Phi\left(x\left(\frac{2k^2}{\rho}\right)^{1/3}\right),$$

where $\Phi(t)$ is the Airy function.†

For the intensity $I \sim |u_P|^2$, we write

$$I = 2A\left(\frac{2k^2}{\rho}\right)^{1/6} \Phi^2\left(x\left(\frac{2k^2}{\rho}\right)^{1/3}\right)$$

(concerning the choice of the constant factor, cf. below).

For large positive values of x, we have from this the asymptotic formula

$$I \approx \frac{A}{2\sqrt{x}} \exp\left\{-\frac{4x^{3/2}}{3}\sqrt{\frac{2k^2}{\rho}}\right\},$$

that is, the intensity drops exponentially (shadow region). For large negative values of x, we have

$$I \approx \frac{2A}{\sqrt{-x}} \sin^2\left\{\frac{2(-x)^{3/2}}{3}\sqrt{\frac{2k^2}{\rho}} + \frac{\pi}{4}\right\},$$

that is, the intensity oscillates rapidly; its average value over these oscillations is

$$\bar{I} = \frac{A}{\sqrt{-x}}.$$

From this the meaning of the constant A is clear: it is the intensity far from the caustic which would be
obtained from geometrical optics neglecting diffraction effects.


The function $\Phi(t)$ attains its largest value, 0.949, for t = -1.02; correspondingly, the maximum intensity
is reached at $x(2k^2/\rho)^{1/3} = -1.02$, where

$$I = 2.03\,A k^{1/3} \rho^{-1/6}.$$

At the point where the ray is tangent to the caustic (x = 0), we have $I = 0.89\,A k^{1/3}\rho^{-1/6}$ [since $\Phi(0) = 0.629$].
Thus near the caustic the intensity is proportional to $k^{1/3}$, that is, to $\lambda^{-1/3}$ ($\lambda$ is the wavelength). For
$\lambda \to 0$, the intensity goes to infinity, as it should (see § 54).

† The Airy function $\Phi(t)$ is defined as

$$\Phi(t) = \frac{1}{\sqrt{\pi}} \int_0^{\infty} \cos\left(\frac{\xi^3}{3} + \xi t\right) d\xi \tag{1}$$

(see Quantum Mechanics, Mathematical Appendices, § b). For large positive values of the argument, the
asymptotic expression for $\Phi(t)$ is

$$\Phi(t) \approx \frac{1}{2t^{1/4}} \exp\left(-\frac{2}{3}t^{3/2}\right), \tag{2}$$

that is, $\Phi(t)$ goes exponentially to zero. For large negative values of t, the function $\Phi(t)$ oscillates with
decreasing amplitude according to the law:

$$\Phi(t) \approx \frac{1}{(-t)^{1/4}} \sin\left(\frac{2}{3}(-t)^{3/2} + \frac{\pi}{4}\right). \tag{3}$$

The Airy function is related to the MacDonald function (modified Hankel function) of order 1/3:

$$\Phi(t) = \sqrt{\frac{t}{3\pi}}\, K_{1/3}\left(\frac{2}{3}t^{3/2}\right). \tag{4}$$

Formula (2) corresponds to the asymptotic expansion of $K_\nu(t)$:

$$K_\nu(t) \approx \sqrt{\frac{\pi}{2t}}\, e^{-t}.$$
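
The numerical values quoted in this solution can be reproduced with a short calculation. The sketch below is an illustration only and rests on two assumptions of ours: that $\Phi(t)$ equals $\sqrt{\pi}$ times the standard Airy function Ai(t), which follows from the integral definition in the footnote, and that the Maclaurin series of Ai is accurate enough for moderate |t| (it loses precision through cancellation when |t| is large). The function names are hypothetical.

```python
import math


def airy_phi(t, terms=60):
    """Phi(t) = sqrt(pi) * Ai(t), using the Maclaurin series of Ai (moderate |t| only)."""
    ai_0 = 1.0 / (3.0 ** (2.0 / 3.0) * math.gamma(2.0 / 3.0))          # Ai(0)
    minus_ai_prime_0 = 1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(1.0 / 3.0))  # -Ai'(0)
    f_term, g_term = 1.0, t          # leading terms of the two series solutions of y'' = t*y
    f_sum, g_sum = f_term, g_term
    for k in range(terms):
        f_term *= t ** 3 / ((3 * k + 2) * (3 * k + 3))
        g_term *= t ** 3 / ((3 * k + 3) * (3 * k + 4))
        f_sum += f_term
        g_sum += g_term
    return math.sqrt(math.pi) * (ai_0 * f_sum - minus_ai_prime_0 * g_sum)


def phi_maximum(lo=-2.0, hi=0.0, steps=2000):
    """Locate the largest value of Phi on a uniform grid in [lo, hi]."""
    grid = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    best_t = max(grid, key=airy_phi)
    return best_t, airy_phi(best_t)
```

Calling phi_maximum() gives the position and height of the principal maximum, and the combination 2 * 2 ** (1 / 6) * airy_phi(t) ** 2 gives the numerical coefficient of $A k^{1/3}\rho^{-1/6}$ in the intensity at the corresponding point.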


§ 60. Fresnel diffraction 

If the light source and the point P at which we determine the intensity of the light are 
located at finite distances from the screen, then in determining the intensity at the point P, 
only those points are important which lie in a small region of the wave surface over which 
we integrate in (59.2)—the region which lies near the line joining the source and the point 
P. In fact, since the deviations from geometrical optics are small, the intensity of the light 
arriving at P from various points of the wave surface decreases very rapidly as we move 
away from this line. Diffraction phenomena in which only a small portion of the wave 
surface plays a role are called Fresnel diffraction phenomena. 

Let us consider the Fresnel diffraction by a screen. From what we have just said, for a 
given point P only a small region at the edge of the screen is important for this diffraction. 
But over sufficiently small regions, the edge of the screen can always be considered to be 
a straight line. We shall therefore, from now on, understand the edge of the screen to mean 
just such a small straight line segment. 

We choose as the XY plane a plane passing through the light source Q (Fig. 11) and 
through the line of the edge of the screen. Perpendicular to this, we choose the plane XZ so 
that it passes through the point Q and the point of observation P, at which we try to 
determine the light intensity. Finally, we choose the origin of coordinates O on the line of 
the edge of the screen, after which the positions of all three axes are completely determined. 



Let the distance from the light source Q to the origin be $D_q$. We denote the x-coordinate
of the point of observation P by $D_p$, and its z-coordinate, i.e. its distance from the XY plane,
by d. According to geometrical optics, the light should pass only through points lying above
the XY plane; the region below the XY plane is the region which according to geometrical
optics should be in shadow (region of geometrical shadow).

We now determine the distribution of light intensity on the screen near the edge of the
geometrical shadow, i.e. for values of d small compared with $D_p$ and $D_q$. A negative d means
that the point P is located within the geometrical shadow.

As the surface of integration in (59.2) we choose the half-plane passing through the line
of the edge of the screen and perpendicular to the XY plane. The coordinates x and y of points
on this surface are related by the equation $x = y\tan\alpha$ ($\alpha$ is the angle between the line of the
edge of the screen and the Y axis), and the z-coordinate is positive. The field of the wave
produced by the source Q, at the distance $R_q$ from it, is proportional to the factor $e^{ikR_q}$.
Therefore the field u on the surface of integration is proportional to

$$u \sim \exp\left\{ik\sqrt{y^2 + z^2 + (D_q + y\tan\alpha)^2}\right\}.$$

In the integral (59.2) we must now substitute for R,

$$R = \sqrt{y^2 + (z - d)^2 + (D_p - y\tan\alpha)^2}.$$

The slowly varying factors in the integrand are unimportant compared with the exponential.
Therefore we may consider 1/R constant, and write dy dz in place of $df_n$. We then find that
the field at the point P is

$$u_P \sim \int_{-\infty}^{+\infty}\int_0^{\infty} \exp\left\{ik\left[\sqrt{(D_q + y\tan\alpha)^2 + y^2 + z^2} + \sqrt{(D_p - y\tan\alpha)^2 + (z - d)^2 + y^2}\right]\right\} dz\,dy. \tag{60.1}$$

As we have already said, the light passing through the point P comes mainly from points
of the plane of integration which are in the neighbourhood of O. Therefore in the integral
(60.1) only values of y and z which are small (compared with $D_q$ and $D_p$) are important. For
this reason we can write

$$\sqrt{(D_q + y\tan\alpha)^2 + y^2 + z^2} \approx D_q + \frac{y^2 + z^2}{2D_q} + y\tan\alpha,$$

$$\sqrt{(D_p - y\tan\alpha)^2 + (z - d)^2 + y^2} \approx D_p + \frac{(z - d)^2 + y^2}{2D_p} - y\tan\alpha.$$

We substitute this in (60.1). Since we are interested only in the field as a function of the
distance d, the constant factor $\exp\{ik(D_p + D_q)\}$ can be omitted; the integral over y also
gives an expression not containing d, so we omit it also. We then find

$$u_P \sim \int_0^{\infty} \exp\left\{ik\left(\frac{1}{2D_q}z^2 + \frac{1}{2D_p}(z - d)^2\right)\right\} dz.$$

This expression can also be written in the form

$$u_P \sim \exp\left\{ik\frac{d^2}{2(D_p + D_q)}\right\} \int_0^{\infty} \exp\left\{ik\,\frac{1}{2}\left(\frac{1}{D_p} + \frac{1}{D_q}\right)\left(z - \frac{D_q}{D_q + D_p}d\right)^2\right\} dz. \tag{60.2}$$

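The light intensity is determined by the square of the field, that is, by the square modulus
$|u_P|^2$. Therefore, when calculating the intensity, the factor standing in front of the integral is
irrelevant, since when multiplied by the complex conjugate expression it gives unity. An
obvious substitution reduces the integral to

$$u_P \sim \int_{-w}^{\infty} e^{i\eta^2}\,d\eta, \tag{60.3}$$

where

$$w = d\sqrt{\frac{kD_q}{2D_p(D_q + D_p)}}. \tag{60.4}$$

Thus, the intensity I at the point P is:

$$I = \frac{I_0}{2}\left|\sqrt{\frac{2}{\pi}}\int_{-w}^{\infty} e^{i\eta^2}\,d\eta\right|^2 = \frac{I_0}{2}\left\{\left(C(w^2) + \frac{1}{2}\right)^2 + \left(S(w^2) + \frac{1}{2}\right)^2\right\}, \tag{60.5}$$

where

$$C(z) = \sqrt{\frac{2}{\pi}}\int_0^{\sqrt{z}} \cos\eta^2\,d\eta, \qquad S(z) = \sqrt{\frac{2}{\pi}}\int_0^{\sqrt{z}} \sin\eta^2\,d\eta$$

are called the Fresnel integrals (for negative w, the terms $C(w^2)$ and $S(w^2)$ in (60.5) are to be
taken with the opposite sign, since the integral from $-w$ to 0 then changes sign). Formula (60.5)
solves our problem of determining the light
intensity as a function of d. The quantity $I_0$ is the intensity in the illuminated region at points
not too near the edge of the shadow; more precisely, at those points with $w \gg 1$ ($C(\infty) =
S(\infty) = \frac{1}{2}$ in the limit $w \to \infty$).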

The region of geometrical shadow corresponds to negative w. It is easy to find the asymptotic 
form of the function I(w) for large negative values of w. To do this we proceed as follows. 
Integrating by parts, we have

$$\int_{|w|}^{\infty} e^{i\eta^2}\,d\eta = -\frac{1}{2i|w|}e^{iw^2} + \frac{1}{2i}\int_{|w|}^{\infty} e^{i\eta^2}\frac{d\eta}{\eta^2}.$$

Integrating by parts once more on the right side of the equation and repeating this process,
we obtain an expansion in powers of 1/|w|:

$$\int_{|w|}^{\infty} e^{i\eta^2}\,d\eta = e^{iw^2}\left[-\frac{1}{2i|w|} + \frac{1}{4|w|^3} - \ldots\right]. \tag{60.6}$$

Although an infinite series of this type does not converge, nevertheless, because the successive
terms decrease very rapidly for large values of |w|, the first term already gives a good
representation of the function on the left for sufficiently large |w| (such a series is said to be
asymptotic). Thus, for the intensity I(w), (60.5), we obtain the following asymptotic formula,
valid for large negative values of w:

$$I = \frac{I_0}{4\pi w^2}. \tag{60.7}$$

We see that in the region of geometric shadow, far from its edge, the intensity goes to zero
as the inverse square of the distance from the edge of the shadow.

We now consider positive values of w, that is, the region above the XY plane. We write 


$$\int_{-w}^{\infty} e^{i\eta^2}\,d\eta = \int_{-\infty}^{+\infty} e^{i\eta^2}\,d\eta - \int_{-\infty}^{-w} e^{i\eta^2}\,d\eta = (1 + i)\sqrt{\frac{\pi}{2}} - \int_{w}^{\infty} e^{i\eta^2}\,d\eta.$$

For sufficiently large w, we can use an asymptotic representation for the integral standing on
the right side of the equation, and we have

$$\int_{-w}^{\infty} e^{i\eta^2}\,d\eta \approx (1 + i)\sqrt{\frac{\pi}{2}} + \frac{1}{2iw}e^{iw^2}. \tag{60.8}$$

Substituting this expression in (60.5), we obtain

$$I = I_0\left[1 + \sqrt{\frac{1}{\pi}}\,\frac{\sin\left(w^2 - \frac{\pi}{4}\right)}{w}\right]. \tag{60.9}$$

Thus in the illuminated region, far from the edge of the shadow, the intensity has an infinite
sequence of maxima and minima, so that the ratio $I/I_0$ oscillates on both sides of unity. With
increasing w, the amplitude of these oscillations decreases inversely with the distance from
the edge of the geometric shadow, and the positions of the maxima and minima steadily
approach one another.

For small w, the function I(w) has qualitatively this same character (Fig. 12). In the region
of the geometric shadow, the intensity decreases monotonically as we move away from the
boundary of the shadow. (On the boundary itself, $I/I_0 = \frac{1}{4}$.) For positive w, the intensity has
alternating maxima and minima. At the first (largest) maximum, $I/I_0 = 1.37$.
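
As a numerical illustration of (60.5), the following sketch evaluates $I/I_0$ as a function of the dimensionless distance w defined in (60.4). It is not part of the original derivation: the function names and the step density are our own choices, and the integrals of $\cos\eta^2$ and $\sin\eta^2$ from 0 to w are computed by Simpson's rule, so that they are odd in w and the shadow side (w < 0) is handled by the same code.

```python
import math


def fresnel_cs(w, steps_per_unit=400):
    """sqrt(2/pi) times the integrals of cos(eta^2) and sin(eta^2) from 0 to w (Simpson's rule)."""
    panels = max(2, 2 * int(abs(w) * steps_per_unit / 2 + 1))  # even number of panels
    h = w / panels                                             # negative when w < 0
    c_sum = s_sum = 0.0
    for i in range(panels + 1):
        eta = i * h
        weight = 1 if i in (0, panels) else (4 if i % 2 else 2)
        c_sum += weight * math.cos(eta * eta)
        s_sum += weight * math.sin(eta * eta)
    scale = math.sqrt(2.0 / math.pi) * h / 3.0
    return c_sum * scale, s_sum * scale


def relative_intensity(w):
    """I/I0 from (60.5); w > 0 is the illuminated side, w < 0 the geometrical shadow."""
    c, s = fresnel_cs(w)
    return 0.5 * ((c + 0.5) ** 2 + (s + 0.5) ** 2)
```

Scanning relative_intensity over a grid of positive w locates the first maximum, while evaluating it at large negative and large positive w allows a comparison with the asymptotic formulas (60.7) and (60.9).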



§ 61. Fraunhofer diffraction 

Of special interest for physical applications are those diffraction phenomena which occur 
when a plane parallel bundle of rays is incident on a screen. As a result of the diffraction, 
the beam ceases to be parallel, and there is light propagation along directions other than the 
initial one. Let us consider the problem of determining the distribution over direction of the 
intensity of the diffracted light at large distances beyond the screen (this formulation of the 
problem corresponds to Fraunhofer diffraction). Here we shall again restrict ourselves to
the case of small deviations from geometrical optics, i.e. we shall assume that the angles of
deviation of the rays from the initial direction (the diffraction angles) are small.

This problem can be solved by starting from the general formula (59.2) and passing to the 
limit where the light source and the point of observation are at infinite distances from the 
screen. A characteristic feature of the case we are considering is that, in the integral which 
determines the intensity of the diffracted light, the whole wave surface over which the 
integral is taken is important (in contrast to the case of Fresnel diffraction, where only the 
portions of the wave surface near the edge of the screens are important).†

However, it is simpler to treat this problem anew, without recourse to the general formula 
(59.2). 

Let us denote by $u_0$ the field which would exist beyond the screens if geometrical optics
were rigorously valid. This field is a plane wave, but its cross-section has certain regions
(corresponding to the “shadows” of opaque screens) in which the field is zero. We denote by
S the part of the plane cross-section on which the field $u_0$ is different from zero; since each
such plane is a wave surface of the plane wave, $u_0$ = const over the whole surface S.

Actually, however, a wave with a limited cross-sectional area cannot be strictly plane (see 
§ 58). In its spatial Fourier expansion there appear components with wave vectors having 
different directions, and this is precisely the origin of the diffraction. 

Let us expand the field $u_0$ into a two-dimensional Fourier integral with respect to the
coordinates y, z in the plane of the transverse cross-section of the wave. For the Fourier
components, we have:

$$u_{\mathbf{q}} = \iint u_0\, e^{-i\mathbf{q}\cdot\mathbf{r}}\,dy\,dz, \tag{61.1}$$

where the vectors q are constant vectors in the y, z plane; the integration actually extends
only over that portion S of the y, z plane on which $u_0$ is different from zero. If k is the wave
vector of the incident wave, the field component $u_{\mathbf{q}} e^{i\mathbf{q}\cdot\mathbf{r}}$ gives the wave vector k' = k + q.
Thus the vector q = k' - k determines the change in the wave vector of the light in the
diffraction. Since the absolute values $k = k' = \omega/c$, the small diffraction angles $\theta_y$, $\theta_z$ in the
xy- and xz-planes are related to the components of the vector q by the equations

$$q_y = \frac{\omega}{c}\theta_y, \qquad q_z = \frac{\omega}{c}\theta_z. \tag{61.2}$$

For small deviations from geometrical optics, the components in the expansion of the field
$u_0$ can be assumed to be identical with the components of the actual field of the diffracted
light, so that formula (61.1) solves our problem.


† The criteria for Fresnel and Fraunhofer diffraction are easily found by returning to formula (60.2) and
applying it, for example, to a slit of width a (instead of to the edge of an isolated screen). The integration
over z in (60.2) should then be taken between the limits from 0 to a. Fresnel diffraction corresponds to the
case when the term containing $z^2$ in the exponent of the integrand is important, and the upper limit of the
integral can be replaced by $\infty$. For this to be the case, we must have

$$ka^2\left(\frac{1}{D_p} + \frac{1}{D_q}\right) \gg 1.$$

On the other hand, if this inequality is reversed, the term in $z^2$ can be dropped; this corresponds to the case
of Fraunhofer diffraction.

The intensity distribution of the diffracted light is given by the square $|u_{\mathbf{q}}|^2$ as a function
of the vector q. The quantitative connection with the intensity of the incident light is
established by the formula

$$\iint u_0^2\,dy\,dz = \iint |u_{\mathbf{q}}|^2\,\frac{dq_y\,dq_z}{(2\pi)^2} \tag{61.3}$$

[compare (49.8)]. From this we see that the relative intensity diffracted into the solid angle
$do = d\theta_y\,d\theta_z$ is given by

$$\frac{|u_{\mathbf{q}}|^2}{u_0^2}\,\frac{dq_y\,dq_z}{(2\pi)^2} = \left(\frac{\omega}{2\pi c}\right)^2 \left|\frac{u_{\mathbf{q}}}{u_0}\right|^2 do. \tag{61.4}$$

Let us consider the Fraunhofer diffraction from two screens which are “complementary”: 
the first screen has holes where the second is opaque and conversely. We denote by $u^{(1)}$ and
$u^{(2)}$ the field of the light diffracted by these screens (when the same light is incident in both
cases). Since $u_{\mathbf{q}}^{(1)}$ and $u_{\mathbf{q}}^{(2)}$ are expressed by integrals (61.1) taken over the surfaces of the
apertures in the screens, and since the apertures in the two screens complement one another
to give the whole plane, the sum $u_{\mathbf{q}}^{(1)} + u_{\mathbf{q}}^{(2)}$ is the Fourier component of the field obtained
in the absence of the screens, i.e. it is simply the incident light. But the incident light is a
rigorously plane wave with definite direction of propagation, so that $u_{\mathbf{q}}^{(1)} + u_{\mathbf{q}}^{(2)} = 0$ for all
nonzero values of q. Thus we have $u_{\mathbf{q}}^{(1)} = -u_{\mathbf{q}}^{(2)}$, or for the corresponding intensities,

$$|u_{\mathbf{q}}^{(1)}|^2 = |u_{\mathbf{q}}^{(2)}|^2 \quad \text{for } \mathbf{q} \neq 0. \tag{61.5}$$

This means that complementary screens give the same distribution of intensity of the
diffracted light (this is called Babinet’s principle).
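
The argument can be illustrated on a discrete, one-dimensional model. In the sketch below, which is our own construction and not part of the text, the integral (61.1) is replaced by a discrete Fourier transform over a periodic grid, an aperture is a list of zeros and ones, and the complementary screen is obtained by exchanging them. The helper names are hypothetical.

```python
import cmath


def dft(values):
    """Plain discrete Fourier transform, standing in for the integral (61.1) on a periodic grid."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * q * j / n) for j, v in enumerate(values))
        for q in range(n)
    ]


def babinet_pair(aperture):
    """Return the transforms of a 0/1 aperture mask and of its complementary mask."""
    complement = [1 - a for a in aperture]
    return dft(aperture), dft(complement)
```

For any mask the two transforms add up to the transform of the unobstructed wave, which vanishes for every nonzero q, so that the squared moduli of the two transforms coincide for all q except q = 0.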

We call attention here to one interesting consequence of the Babinet principle. Let us 
consider a black body, i.e. one which absorbs completely all the light falling on it. According 
to geometrical optics, when such a body is illuminated, there is produced behind it a region 
of geometrical shadow, whose cross-sectional area is equal to the area of the body in the 
direction perpendicular to the direction of incidence of the light. However, the presence of 
diffraction causes the light passing by the body to be partially deflected from its initial 
direction. As a result, at large distances behind the body there will not be complete shadow 
but, in addition to the light propagating in the original direction, there will also be a certain 
amount of light propagating at small angles to the original direction. It is easy to determine 
the intensity of this scattered light. To do this, we point out that according to Babinet’s 
principle, the amount of light deviated because of diffraction by the body under consideration 
is equal to the amount of light which would be deviated by diffraction from an aperture cut 
in an opaque screen, the shape and size of the aperture being the same as that of the 
transverse section of the body. But in Fraunhofer diffraction from an aperture all the light 
passing through the aperture is deflected. From this it follows that the total amount of light 
scattered by a black body is equal to the amount of light falling on its surface and absorbed 
by it. 


PROBLEMS 

1. Calculate the Fraunhofer diffraction of a plane wave normally incident on an infinite slit (of width 2a) 
with parallel sides cut in an opaque screen.

Solution: We choose the plane of the slit as the yz plane, with the z axis along the slit (Fig. 13 shows a
section of the screen). For normally incident light, the plane of the slit is one of the wave surfaces, and we
choose it as the surface of integration in (61.1). Since the slit is infinitely long, the light is deflected only
in the xy plane [since the integral (61.1) becomes zero for $q_z \neq 0$].

Therefore the field should be expanded only in the y coordinate:

$$u_q = u_0 \int_{-a}^{a} e^{-iqy}\,dy = \frac{2u_0}{q}\sin qa.$$

The intensity of the diffracted light in the angular range $d\theta$ is

$$dI = \frac{I_0}{2a}\left|\frac{u_q}{u_0}\right|^2 \frac{dq}{2\pi} = \frac{I_0}{\pi ak}\,\frac{\sin^2 ka\theta}{\theta^2}\,d\theta,$$

where $k = \omega/c$, and $I_0$ is the total intensity of the light incident on the slit.

$dI/d\theta$ as a function of diffraction angle has the form shown in Fig. 14. As $\theta$ increases toward either side
from $\theta = 0$, the intensity goes through a series of maxima with rapidly decreasing height. The successive
maxima are separated by minima at the points $\theta = n\pi/ka$ (where n is a nonzero integer); at the minima, the intensity
falls to zero.
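
The single-slit result lends itself to a quick numerical check. The sketch below is illustrative only: the function names are ours, the pattern is the small-angle expression $\sin^2(ka\theta)/(\pi ak\theta^2)$ for $dI/d\theta$ divided by $I_0$, and the total is obtained by a midpoint-rule integral over a finite angular range, which slightly underestimates unity because the tails are cut off.

```python
import math


def slit_pattern(theta, k, a):
    """(dI/dtheta) / I0 for a slit of width 2a in the small-angle approximation."""
    if theta == 0.0:
        return k * a / math.pi        # limiting value at theta = 0
    return math.sin(k * a * theta) ** 2 / (math.pi * a * k * theta ** 2)


def total_fraction(k, a, theta_max, steps=200000):
    """Midpoint-rule integral of the pattern over (-theta_max, theta_max)."""
    h = 2.0 * theta_max / steps
    return h * sum(slit_pattern(-theta_max + (i + 0.5) * h, k, a) for i in range(steps))
```

With ka large and theta_max covering many side maxima, total_fraction approaches unity, expressing the fact that all the light passing through the slit appears in the diffraction pattern; slit_pattern evaluated at $\theta = n\pi/ka$ with nonzero integer n gives the zeros between the maxima.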



Fig. 14.

2. Calculate the Fraunhofer diffraction by a diffraction grating: a plane screen in which are cut a series
of identical parallel slits (the width of the slits is 2a, the width of opaque screen between neighbouring slits
is 2b, and the number of slits is N).

Solution: We choose the plane of the grating as the yz plane, with the z axis parallel to the slits.
Diffraction occurs only in the xy plane, and integration of (61.1) gives:

$$u_q = u'_q \sum_{n=0}^{N-1} e^{-2inqd} = u'_q\,\frac{1 - e^{-2iNqd}}{1 - e^{-2iqd}},$$

where d = a + b, and $u'_q$ is the result of the integration over a single slit. Using the results of problem 1,
we get:

$$dI = \frac{I_0 a}{N\pi}\left(\frac{\sin Nqd}{\sin qd}\right)^2 \left(\frac{\sin qa}{qa}\right)^2 dq = \frac{I_0}{N\pi ak}\left(\frac{\sin Nk\theta d}{\sin k\theta d}\right)^2 \frac{\sin^2 ka\theta}{\theta^2}\,d\theta$$

($I_0$ is the total intensity of the light passing through all the slits).

For the case of a large number of slits ($N \to \infty$), this formula can be written in another form. For values
$q = \pi n/d$, where n is an integer, dI/dq has a maximum; near such a maximum (i.e. for $qd = n\pi + \varepsilon$, with $\varepsilon$
small)

$$dI = I_0 a\left(\frac{\sin qa}{qa}\right)^2 \frac{\sin^2 N\varepsilon}{\pi N\varepsilon^2}\,dq.$$

But for $N \to \infty$ we have the formula†

$$\lim_{N\to\infty} \frac{\sin^2 Nx}{\pi Nx^2} = \delta(x).$$

We therefore have, in the neighbourhood of each maximum:

$$dI = I_0\,\frac{a}{d}\left(\frac{\sin qa}{qa}\right)^2 \delta(\varepsilon)\,d\varepsilon,$$

i.e., in the limit the widths of the maxima are infinitely narrow and the total light intensity in the n'th
maximum is

$$I^{(n)} = I_0\,\frac{d}{\pi^2 a n^2}\,\sin^2\frac{n\pi a}{d}.$$
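
Two steps of this solution are easy to check numerically: the summation of the geometric series over the N slits, and the distribution of the total intensity among the principal maxima. The sketch below is an illustration under our own naming; the value used for the zeroth maximum is the limit of the formula for $I^{(n)}$ as $n \to 0$, namely $I_0 a/d$.

```python
import cmath
import math


def array_factor(q, d, n_slits):
    """Squared modulus of the sum of exp(-2*i*n*q*d) for n = 0 .. N-1, summed directly."""
    return abs(sum(cmath.exp(-2j * n * q * d) for n in range(n_slits))) ** 2


def array_factor_closed(q, d, n_slits):
    """Closed form (sin(N*q*d) / sin(q*d))**2 of the same quantity (q*d not a multiple of pi)."""
    return (math.sin(n_slits * q * d) / math.sin(q * d)) ** 2


def order_intensity(n, a, d):
    """Fraction I^(n)/I0 of the light in the n-th principal maximum for a very large number of slits."""
    if n == 0:
        return a / d
    return d * math.sin(n * math.pi * a / d) ** 2 / (math.pi ** 2 * a * n ** 2)
```

Summing order_intensity over all positive and negative orders should approach unity, in agreement with the known series $\sum_{n \ge 1} \sin^2(nx)/n^2 = x(\pi - x)/2$ for $0 \le x \le \pi$: all the light passing through the slits is distributed among the principal maxima.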


† For $x \neq 0$ the function on the left side of the equation is zero, while according to a well-known formula
of the theory of Fourier series,

$$\lim_{N\to\infty} \frac{1}{\pi}\int_{-a}^{a} f(x)\,\frac{\sin^2 Nx}{Nx^2}\,dx = f(0).$$

From this we see that the properties of this function actually coincide with those of the δ-function (see the
footnote on p. 74).

3. Find the distribution of intensity over direction for the diffraction of light which is incident normal to
the plane of a circular aperture of radius a.

Solution: We introduce cylindrical coordinates z, r, $\phi$ with the z axis passing through the centre of the
aperture and perpendicular to its plane. It is obvious that the diffraction is symmetric about the z axis, so
that the vector q has only a radial component $q_r = q = k\theta$. Measuring the angle $\phi$ from the direction q, and
integrating in (61.1) over the plane of the aperture, we find:

$$u_q = u_0\int_0^a\int_0^{2\pi} e^{-iqr\cos\phi}\,r\,d\phi\,dr = 2\pi u_0\int_0^a J_0(qr)\,r\,dr,$$

where $J_0$ is the zero'th order Bessel function. Using the well-known formula

$$\int_0^a J_0(qr)\,r\,dr = \frac{a}{q}J_1(aq),$$

we then have

$$u_q = \frac{2\pi u_0 a}{q}J_1(aq),$$

and according to (61.4) we obtain for the intensity of the light diffracted into the element of solid angle do:

$$dI = I_0\,\frac{J_1^2(ak\theta)}{\pi\theta^2}\,do,$$

where $I_0$ is the total intensity of the light incident on the aperture.
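
The angular size of the central bright spot follows from the first zero of $J_1$. As a hedged illustration (the function names and bracketing interval are our own choices), the sketch below evaluates $J_1$ from Bessel's integral $J_1(x) = \frac{1}{\pi}\int_0^{\pi}\cos(\tau - x\sin\tau)\,d\tau$ by the midpoint rule and finds the first dark ring by bisection.

```python
import math


def bessel_j1(x, steps=2000):
    """J_1(x) from Bessel's integral, evaluated with the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi


def first_dark_ring(k, a):
    """Smallest angle theta > 0 with J_1(a*k*theta) = 0, located by bisection."""
    lo, hi = 3.0, 4.5                 # J_1 changes sign exactly once in this bracket
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bessel_j1(lo) * bessel_j1(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi) / (k * a)
```

Writing $k = 2\pi/\lambda$, the angle returned by first_dark_ring corresponds to the familiar estimate of about $0.61\,\lambda/a$ for the angular radius of the first dark ring.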
§ 62. The retarded potentials

In Chapter 5 we studied the constant field, produced by charges at rest, and in Chapter 6, 
the variable field in the absence of charges. Now we take up the study of varying fields in 
the presence of arbitrarily moving charges. 

We derive equations determining the potentials for arbitrarily moving charges. This derivation 
is most conveniently done in four-dimensional form, repeating the derivation at the end of 
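§ 46, with the one change that we use the second pair of Maxwell equations in the form
(30.2)

$$\frac{\partial F^{ik}}{\partial x^k} = -\frac{4\pi}{c}j^i.$$

The same right-hand side also appears in (46.8), and after imposing the Lorentz condition

$$\frac{\partial A^i}{\partial x^i} = 0, \quad \text{i.e.} \quad \frac{1}{c}\frac{\partial\phi}{\partial t} + \operatorname{div}\mathbf{A} = 0, \tag{62.1}$$

on the potentials, we get

$$\frac{\partial^2 A^i}{\partial x_k\,\partial x^k} = \frac{4\pi}{c}j^i. \tag{62.2}$$

This is the equation which determines the potentials of an arbitrary electromagnetic field.
In three-dimensional form it is written as two equations, for A and for $\phi$:

$$\Delta\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\frac{4\pi}{c}\mathbf{j}, \tag{62.3}$$

$$\Delta\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -4\pi\rho. \tag{62.4}$$

For constant fields, these reduce to the already familiar equations (36.4) and (43.4), and for
variable fields without charges, to the homogeneous wave equation.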

As we know, the solution of the inhomogeneous linear equations (62.3) and (62.4) can be 
represented as the sum of the solution of these equations without the right-hand side, and a 
particular integral of these equations with the right-hand side. To find the particular solution, 
we divide the whole space into infinitely small regions and determine the field produced by
the charges located in one of these volume elements. Because of the linearity of the field
equations, the actual field will be the sum of the fields produced by all such elements. 

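The charge de in a given volume element is, generally speaking, a function of the time. If
we choose the origin of coordinates in the volume element under consideration, then the
charge density is $\rho = de(t)\,\delta(\mathbf{R})$, where R is the distance from the origin. Thus we must
solve the equation

$$\Delta\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -4\pi\,de(t)\,\delta(\mathbf{R}). \tag{62.5}$$

Everywhere, except at the origin, $\delta(\mathbf{R}) = 0$, and we have the equation

$$\Delta\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0. \tag{62.6}$$

It is clear that in the case we are considering $\phi$ has central symmetry, i.e. $\phi$ is a function only
of R. Therefore if we write the Laplace operator in spherical coordinates, (62.6) reduces to

$$\frac{1}{R^2}\frac{\partial}{\partial R}\left(R^2\frac{\partial\phi}{\partial R}\right) - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0.$$

To solve this equation, we make the substitution $\phi = \chi(R, t)/R$. Then, we find for $\chi$

$$\frac{\partial^2\chi}{\partial R^2} - \frac{1}{c^2}\frac{\partial^2\chi}{\partial t^2} = 0.$$

But this is the equation of plane waves, whose solution has the form (see § 47):

$$\chi = f_1\left(t - \frac{R}{c}\right) + f_2\left(t + \frac{R}{c}\right).$$

Since we only want a particular solution of the equation, it is sufficient to choose only one
of the functions $f_1$ and $f_2$. Usually it turns out to be convenient to take $f_2 = 0$ (concerning this,
see below). Then, everywhere except at the origin, $\phi$ has the form

$$\phi = \frac{\chi\left(t - \frac{R}{c}\right)}{R}. \tag{62.7}$$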
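
That a function of the form $\chi(t - R/c)/R$ satisfies the spherically symmetric wave equation away from the origin can be confirmed by finite differences. The sketch below is only an illustration: the Gaussian profile chosen for $\chi$, the units with c = 1 and the function names are assumptions of ours, and any smooth profile would serve equally well.

```python
import math

C = 1.0  # speed of light in arbitrary units (an assumption made for the illustration)


def chi(s):
    """Hypothetical smooth profile chi(s); any twice-differentiable function would do."""
    return math.exp(-s * s)


def phi(R, t):
    """Outgoing spherical wave chi(t - R/c) / R."""
    return chi(t - R / C) / R


def wave_residual(R, t, h=1e-3):
    """Central-difference value of (1/R^2) d/dR (R^2 dphi/dR) - (1/c^2) d^2 phi/dt^2."""
    d_r = (phi(R + h, t) - phi(R - h, t)) / (2.0 * h)
    d_rr = (phi(R + h, t) - 2.0 * phi(R, t) + phi(R - h, t)) / h ** 2
    d_tt = (phi(R, t + h) - 2.0 * phi(R, t) + phi(R, t - h)) / h ** 2
    return d_rr + 2.0 * d_r / R - d_tt / C ** 2
```

The residual returned by wave_residual should vanish up to the truncation error of the difference formulas, which is of order $h^2$.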


So far the function $\chi$ is arbitrary; we now choose it so that we also obtain the correct value
for the potential at the origin. In other words, we must select $\chi$ so that at the origin equation
(62.5) is satisfied. This is easily done noting that as $R \to 0$, the potential increases to infinity,
and therefore its derivatives with respect to the coordinates increase more rapidly than its
time derivative. Consequently as $R \to 0$, we can, in equation (62.5), neglect the term
$(1/c^2)(\partial^2\phi/\partial t^2)$ compared with $\Delta\phi$. Then (62.5) goes over into the familiar equation (36.9) leading
to the Coulomb law. Thus, near the origin, (62.7) must go over into the Coulomb law, from
which it follows that $\chi(t) = de(t)$, that is,

$$\phi = \frac{de\left(t - \frac{R}{c}\right)}{R}.$$

From this it is easy to get to the solution of equation (62.4) for an arbitrary distribution of
charges $\rho(x, y, z, t)$. To do this, it is sufficient to write $de = \rho\,dV$ (dV is the volume element)
and integrate over the whole space. To this solution of the inhomogeneous equation (62.4)
we can still add the solution $\phi_0$ of the same equation without the right-hand side. Thus, the
general solution has the form:

$$\phi(\mathbf{r}, t) = \int \frac{1}{R}\,\rho\left(\mathbf{r}', t - \frac{R}{c}\right) dV' + \phi_0, \tag{62.8}$$

$$\mathbf{R} = \mathbf{r} - \mathbf{r}', \qquad dV' = dx'\,dy'\,dz',$$

where

$$\mathbf{r} = (x, y, z), \qquad \mathbf{r}' = (x', y', z');$$

R is the distance from the volume element dV to the “field point” at which we determine the
potential. We shall write this expression briefly as

$$\phi = \int \frac{\rho_{t - R/c}}{R}\,dV + \phi_0, \tag{62.9}$$

where the subscript means that the quantity $\rho$ is to be taken at the time t - (R/c), and the
prime on dV has been omitted.

Similarly we have for the vector potential:

$$\mathbf{A} = \frac{1}{c}\int \frac{\mathbf{j}_{t - R/c}}{R}\,dV + \mathbf{A}_0, \tag{62.10}$$

where $\mathbf{A}_0$ is the solution of equation (62.3) without the right-hand term.

The potentials (62.9) and (62.10) (without $\phi_0$ and $\mathbf{A}_0$) are called the retarded potentials.

In case the charges are at rest (i.e. density $\rho$ independent of the time), formula (62.9) goes
over into the well-known formula (36.8) for the electrostatic field; for the case of stationary
motion of the charges, formula (62.10), after averaging, goes over into formula (43.5) for
the vector potential of a constant magnetic field.

The quantities $\mathbf{A}_0$ and $\phi_0$ in (62.9) and (62.10) are to be determined so that the conditions
of the problem are fulfilled. To do this it is clearly sufficient to impose initial conditions, that
is, to fix the values of the field at the initial time. However we do not usually have to deal
with such initial conditions. Instead we are usually given conditions at large distances from
the system of charges throughout all of time. Thus, we may be told that radiation is incident
on the system from outside. Corresponding to this, the field which is developed as a result
of the interaction of this radiation with the system can differ from the external field only by
the radiation originating from the system. This radiation emitted by the system must, at large
distances, have the form of waves spreading out from the system, that is, in the direction of
increasing R. But precisely this condition is satisfied by the retarded potentials. Thus these
solutions represent the field produced by the system, while $\phi_0$ and $\mathbf{A}_0$ must be set equal to
the external field acting on the system.


§ 63. The Lienard-Wiechert potentials 

Let us determine the potentials for the field produced by a charge carrying out an assigned 
motion along a trajectory $\mathbf{r} = \mathbf{r}_0(t)$.

According to the formulas for the retarded potentials, the field at the point of observation
$P(x, y, z)$ at time $t$ is determined by the state of motion of the charge at the earlier time $t'$,
for which the time of propagation of the light signal from the point $\mathbf{r}_0(t')$, where the charge
was located, to the field point $P$ just coincides with the difference $t - t'$. Let $\mathbf{R}(t) = \mathbf{r} - \mathbf{r}_0(t)$
be the radius vector from the charge $e$ to the point $P$; like $\mathbf{r}_0(t)$ it is a given function of the
time. Then the time $t'$ is determined by the equation


$$t' + \frac{R(t')}{c} = t. \tag{63.1}$$

For each value of $t$ this equation has just one root $t'$.†
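
The uniqueness of the root also makes the retarded time easy to find numerically. The following is a minimal sketch, not part of the original text: the function name `retarded_time`, the example trajectory `circle`, and the choice of units with c = 1 are all assumptions made for illustration. It relies only on the charge moving slower than light, so that the left side of (63.1) minus t increases monotonically with t'.

```python
import math

def retarded_time(r0, r, t, c=1.0):
    """Root t' of t' + |r - r0(t')|/c = t, found by bisection.

    r0 is a hypothetical callable returning the charge position at a given
    time; the bracketing below assumes the charge moves slower than c.
    """
    def g(tp):
        return tp + math.dist(r, r0(tp)) / c - t

    step = 1.0
    while g(t - step) > 0.0:      # step back until the past light cone is crossed
        step *= 2.0
    lo, hi = t - step, t          # now g(lo) <= 0 <= g(hi)
    for _ in range(100):          # fixed number of halvings, enough for doubles
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def circle(tp, a=1.0, w=0.5):
    """Example trajectory (assumption): circular motion with speed a*w = 0.5 c."""
    return (a * math.cos(w * tp), a * math.sin(w * tp), 0.0)


field_point = (3.0, 0.0, 0.0)
t_ret = retarded_time(circle, field_point, 0.0)
residual = t_ret + math.dist(field_point, circle(t_ret)) - 0.0
```

Here `residual` measures how well the returned value satisfies (63.1). Bisection is used rather than a faster method because monotonicity guarantees that it cannot miss the single root.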
In the system of reference in which the particle is at rest at time t', the potential at the 
point of observation at time t is just the Coulomb potential. 


$$\phi = \frac{e}{R(t')}, \qquad \mathbf{A} = 0. \tag{63.2}$$

The expressions for the potentials in an arbitrary reference system can be found directly
by finding a four-vector which for $\mathbf{v} = 0$ coincides with the expressions just given for $\phi$ and
$\mathbf{A}$. Noting that, according to (63.1), $\phi$ in (63.2) can also be written in the form

$$\phi = \frac{e}{c(t - t')},$$

we find that the required four-vector is:

$$A^i = e\,\frac{u^i}{R_k u^k}, \tag{63.3}$$

where $u^k$ is the four-velocity of the charge, $R^k = [c(t - t'),\ \mathbf{r} - \mathbf{r}']$, where $x', y', z', t'$ are related
by the equation (63.1), which in four-dimensional form is

$$R_k R^k = 0. \tag{63.4}$$


Now once more transforming to three-dimensional notation, we obtain, for the potentials of 
the field produced by an arbitrarily moving point charge, the following expressions: 



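$$\phi = \frac{e}{R - \dfrac{\mathbf{v}\cdot\mathbf{R}}{c}}, \qquad \mathbf{A} = \frac{e\mathbf{v}}{c\left(R - \dfrac{\mathbf{v}\cdot\mathbf{R}}{c}\right)}, \tag{63.5}$$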


where R is the radius vector, taken from the point where the charge is located to the point 
of observation P, and all the quantities on the right sides of the equations must be evaluated 
at the time t', determined from (63.1). The potentials of the field, in the form (63.5), are 
called the Lienard-Wiechert potentials. 
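
The passage from the four-dimensional form (63.3) to (63.5) can be written out explicitly. As a sketch, write the four-velocity as $u^i = (\gamma,\ \gamma\mathbf{v}/c)$ with the shorthand $\gamma = (1 - v^2/c^2)^{-1/2}$ (a symbol introduced here only for brevity) and use $c(t - t') = R$ from (63.1):

$$R_k u^k = \gamma\,c(t - t') - \gamma\,\frac{\mathbf{v}\cdot\mathbf{R}}{c} = \gamma\left(R - \frac{\mathbf{v}\cdot\mathbf{R}}{c}\right),$$

$$\phi = \frac{e\gamma}{R_k u^k} = \frac{e}{R - \mathbf{v}\cdot\mathbf{R}/c}, \qquad \mathbf{A} = \frac{e\gamma\,\mathbf{v}/c}{R_k u^k} = \frac{e\mathbf{v}}{c\,(R - \mathbf{v}\cdot\mathbf{R}/c)}.$$

The factor $\gamma$ cancels between numerator and denominator, which is why the velocity enters the potentials only through the combination $R - \mathbf{v}\cdot\mathbf{R}/c$.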


† This point is obvious but it can be verified directly. To do this we choose the field point $P$ and the time
of observation t as the origin O of the four-dimensional coordinate system and construct the light cone 
(§ 2) with its vertex at O. The lower half of the cone, containing the absolute past (with respect to the event 
O), is the geometrical locus of world points such that signals sent from them reach O. The points in which 
this hypersurface intersects the world line of the charge are precisely the roots of (63.1). But since the 
velocity of a particle is always less than the velocity of light, the inclination of its world line relative to the 
time axis is everywhere less than the slope of the light cone. It then follows that the world line of the particle 
can intersect the lower half of the light cone in only one point.

To calculate the intensities of the electric and magnetic fields from the formulas

$$\mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{A}}{\partial t} - \operatorname{grad}\phi, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A},$$

we must differentiate $\phi$ and $\mathbf{A}$ with respect to the coordinates $x, y, z$ of the point, and the time
$t$ of observation. But the formulas (63.5) express the potentials as functions of $t'$, and only
through the relation (63.1) as implicit functions of $x, y, z, t$. Therefore to calculate the
required derivatives we must first calculate the derivatives of $t'$. Differentiating the relation
$R(t') = c(t - t')$ with respect to $t$, we get

$$\frac{\partial R}{\partial t} = \frac{\partial R}{\partial t'}\frac{\partial t'}{\partial t} = -\frac{\mathbf{R}\cdot\mathbf{v}}{R}\frac{\partial t'}{\partial t} = c\left(1 - \frac{\partial t'}{\partial t}\right).$$

(The value of $\partial R/\partial t'$ is obtained by differentiating the identity $R^2 = \mathbf{R}^2$ and substituting
$\partial\mathbf{R}(t')/\partial t' = -\mathbf{v}(t')$. The minus sign is present because $\mathbf{R}$ is the radius vector from the charge
$e$ to the point $P$, and not the reverse.)

Thus, 


$$\frac{\partial t'}{\partial t} = \frac{1}{1 - \dfrac{\mathbf{v}\cdot\mathbf{R}}{Rc}}. \tag{63.6}$$
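
Formula (63.6) can be checked numerically by comparing it with a finite-difference derivative of the retarded time. The sketch below is only an illustration under stated assumptions: units with c = 1, a hypothetical test motion on a unit circle at half the speed of light, and helper names (`retarded_time`, `dtp_dt_formula`, `dtp_dt_numeric`) that do not come from the text. The retarded time is found here by the simple iteration t' ← t − R(t')/c, which converges because the charge is slower than light.

```python
import math

def retarded_time(r0, r, t, c=1.0, iters=200):
    """Fixed-point iteration t' <- t - R(t')/c; a contraction when v < c."""
    tp = t
    for _ in range(iters):
        tp = t - math.dist(r, r0(tp)) / c
    return tp

def dtp_dt_formula(r0, v0, r, t, c=1.0):
    """Right-hand side of (63.6), evaluated at the retarded time."""
    tp = retarded_time(r0, r, t, c)
    R = [x - y for x, y in zip(r, r0(tp))]
    Rn = math.sqrt(sum(x * x for x in R))
    vR = sum(x * y for x, y in zip(v0(tp), R))
    return 1.0 / (1.0 - vR / (Rn * c))

def dtp_dt_numeric(r0, r, t, c=1.0, h=1e-5):
    """Central finite difference of t'(t) at a fixed field point."""
    return (retarded_time(r0, r, t + h, c) - retarded_time(r0, r, t - h, c)) / (2 * h)

# Hypothetical test motion: unit circle traversed at half the speed of light.
pos = lambda s: (math.cos(0.5 * s), math.sin(0.5 * s), 0.0)
vel = lambda s: (-0.5 * math.sin(0.5 * s), 0.5 * math.cos(0.5 * s), 0.0)
field_point = (3.0, 1.0, 0.5)

exact = dtp_dt_formula(pos, vel, field_point, 2.0)
approx = dtp_dt_numeric(pos, field_point, 2.0)
```

The two values `exact` and `approx` should agree to within the accuracy of the finite difference.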


Similarly differentiating the same relation with respect to the coordinates, we find 


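$$\operatorname{grad} t' = -\frac{1}{c}\operatorname{grad} R(t') = -\frac{1}{c}\left(\frac{\partial R}{\partial t'}\operatorname{grad} t' + \frac{\mathbf{R}}{R}\right),$$

so that

$$\operatorname{grad} t' = -\frac{\mathbf{R}}{c\left(R - \dfrac{\mathbf{R}\cdot\mathbf{v}}{c}\right)}. \tag{63.7}$$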


With the aid of these formulas, there is no difficulty in carrying out the calculation of the 
fields E and H. Omitting the intermediate calculations, we give the final results: 




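$$\mathbf{E} = e\,\frac{1 - \dfrac{v^2}{c^2}}{\left(R - \dfrac{\mathbf{R}\cdot\mathbf{v}}{c}\right)^3}\left(\mathbf{R} - \frac{\mathbf{v}}{c}R\right) + \frac{e}{c^2\left(R - \dfrac{\mathbf{R}\cdot\mathbf{v}}{c}\right)^3}\,\mathbf{R}\times\left\{\left(\mathbf{R} - \frac{\mathbf{v}}{c}R\right)\times\dot{\mathbf{v}}\right\}, \tag{63.8}$$

$$\mathbf{H} = \frac{1}{R}\,\mathbf{R}\times\mathbf{E}. \tag{63.9}$$

Here $\dot{\mathbf{v}} = \partial\mathbf{v}/\partial t'$; all quantities on the right sides of the equations refer to the time $t'$. It is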
interesting to note that the magnetic field turns out to be everywhere perpendicular to the 
electric. 
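
The structure of (63.8) and (63.9) is easy to explore numerically. The following sketch evaluates both fields from given retarded quantities; the function name `lw_fields`, the argument order, and the default c = 1 are assumptions made for this illustration, and the caller is expected to supply R, v and the acceleration already evaluated at the time t'.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lw_fields(e, R, v, acc, c=1.0):
    """E from (63.8) and H from (63.9).

    R, v, acc: separation vector, velocity and acceleration of the charge,
    all taken at the retarded time t' (supplied by the caller).
    """
    Rn = math.sqrt(dot(R, R))
    s = Rn - dot(R, v) / c                                  # R - R.v/c
    u = tuple(Ri - vi * Rn / c for Ri, vi in zip(R, v))     # R - (v/c) R
    rad = cross(R, cross(u, acc))                           # R x {(R - vR/c) x dv/dt'}
    k_vel = e * (1.0 - dot(v, v) / c**2) / s**3             # velocity-term prefactor
    k_acc = e / (c**2 * s**3)                               # acceleration-term prefactor
    E = tuple(k_vel * ui + k_acc * ri for ui, ri in zip(u, rad))
    H = tuple(x / Rn for x in cross(R, E))
    return E, H

# Charge at rest: the result should be the Coulomb field with no magnetic field.
E_rest, H_rest = lw_fields(1.0, (0.0, 0.0, 2.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0))

# Arbitrary (hypothetical) retarded data for an accelerated charge.
R_vec = (1.0, 2.0, -0.5)
E_acc, H_acc = lw_fields(1.0, R_vec, (0.3, -0.2, 0.1), (0.4, 0.1, -0.7))
```

Since H is built as a cross product with E, the scalar product of `E_acc` and `H_acc` vanishes up to rounding, which is the perpendicularity noted in the text.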

The electric field (63.8) consists of two parts of different type. The first term depends only 
on the velocity of the particle (and not on its acceleration) and varies at large distances like 
$1/R^2$. The second term depends on the acceleration, and for large $R$ it varies like $1/R$. Later
(§ 66) we shall see that this latter term is related to the electromagnetic waves radiated by
the particle.

As for the first term, since it is independent of the acceleration it must correspond to the 
field produced by a uniformly moving charge. In fact, for constant velocity the difference 


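$$\mathbf{R}_{t'} - \frac{\mathbf{v}}{c}R_{t'} = \mathbf{R}_{t'} - \mathbf{v}(t - t')$$

is the distance $\mathbf{R}_t$ from the charge to the point of observation at precisely the moment of
observation. It is also easy to show directly that

$$R_{t'} - \frac{1}{c}\mathbf{R}_{t'}\cdot\mathbf{v} = \sqrt{R_t^2 - \frac{1}{c^2}(\mathbf{v}\times\mathbf{R}_t)^2} = R_t\sqrt{1 - \frac{v^2}{c^2}\sin^2\theta_t},$$

where $\theta_t$ is the angle between $\mathbf{R}_t$ and $\mathbf{v}$. Consequently the first term in (63.8) is identical with
the expression (38.8).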
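
The identity relating retarded and present quantities for uniform motion can be verified on a concrete case. The sketch below assumes a charge moving as r0(t) = vt in units with c = 1; the function name `both_sides` and the sample numbers are hypothetical and serve only as an illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def both_sides(v, r, t, c=1.0):
    """Return (R_t' - R_t'.v/c, sqrt(R_t^2 - (v x R_t)^2/c^2)) for r0(t) = v*t."""
    tp = t
    for _ in range(400):                        # fixed-point solution of (63.1)
        tp = t - math.dist(r, [x * tp for x in v]) / c
    R_ret = [x - y * tp for x, y in zip(r, v)]  # separation at the retarded time
    R_now = [x - y * t for x, y in zip(r, v)]   # separation at the observation time
    lhs = math.sqrt(dot(R_ret, R_ret)) - dot(R_ret, v) / c
    w = cross(v, R_now)
    rhs = math.sqrt(dot(R_now, R_now) - dot(w, w) / c**2)
    return lhs, rhs

# Charge moving along x at 0.6 c, observed at unit distance abreast of it.
lhs, rhs = both_sides((0.6, 0.0, 0.0), (0.0, 1.0, 0.0), 0.0)
```

For this configuration the retarded time is t' = −1.25 and both sides equal 0.8, as can also be confirmed by hand.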


PROBLEM 

Derive the Lienard-Wiechert potentials by integrating (62.9)-(62.10). 

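Solution: We write formula (62.8) in the form:

$$\phi(\mathbf{r}, t) = \iint \frac{\rho(\mathbf{r}', \tau)}{|\mathbf{r} - \mathbf{r}'|}\,\delta\!\left(\tau - t + \frac{1}{c}|\mathbf{r} - \mathbf{r}'|\right) d\tau\,dV'$$

(and similarly for $\mathbf{A}(\mathbf{r}, t)$), introducing the additional delta function and thus eliminating the implicit
arguments in the function $\rho$. For a point charge, moving in a trajectory $\mathbf{r} = \mathbf{r}_0(t)$, we have:

$$\rho(\mathbf{r}', \tau) = e\,\delta[\mathbf{r}' - \mathbf{r}_0(\tau)].$$

Substituting this expression and integrating over $dV'$, we get:

$$\phi(\mathbf{r}, t) = e\int \frac{d\tau}{|\mathbf{r} - \mathbf{r}_0(\tau)|}\,\delta\!\left[\tau - t + \frac{1}{c}|\mathbf{r} - \mathbf{r}_0(\tau)|\right].$$

The $\tau$ integration is done using the formula

$$\delta[F(\tau)] = \frac{\delta(\tau - t')}{F'(t')}$$

[where $t'$ is the root of $F(t') = 0$], and gives formula (63.5).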
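
The last step can be spelled out as follows. This is a sketch in which $F(\tau) = \tau - t + |\mathbf{r} - \mathbf{r}_0(\tau)|/c$ denotes the argument of the delta function and $\mathbf{R} = \mathbf{r} - \mathbf{r}_0(t')$. Since $\partial|\mathbf{r} - \mathbf{r}_0(\tau)|/\partial\tau = -\mathbf{v}\cdot\mathbf{R}/R$,

$$F'(t') = 1 - \frac{\mathbf{v}\cdot\mathbf{R}}{cR}, \qquad \phi = \frac{e}{R\,F'(t')} = \frac{e}{R - \mathbf{v}\cdot\mathbf{R}/c},$$

where all quantities are taken at the root $t'$ of $F(t') = 0$, which is just equation (63.1). The same factor $1/F'(t')$ appears in the integral for $\mathbf{A}$, with the additional factor $\mathbf{v}/c$ in the integrand.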

§ 64. Spectral resolution of the retarded potentials 

The field produced by moving charges can be expanded into monochromatic waves. The
potentials of the different monochromatic components of the field have the form
$\phi_\omega e^{-i\omega t}$, $\mathbf{A}_\omega e^{-i\omega t}$. The charge and current densities of the system of charges producing the field can
also be expanded in a Fourier series or integral. It is clear that each Fourier component of
$\rho$ and $\mathbf{j}$ is responsible for the creation of the corresponding monochromatic component of
the field.

In order to express the Fourier components of the field in terms of the Fourier components 
of the charge density and current, we substitute in (62.9) for $\phi$ and $\rho$ respectively, $\phi_\omega e^{-i\omega t}$
and $\rho_\omega e^{-i\omega t}$. We then obtain




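$$\phi_\omega e^{-i\omega t} = \int \rho_\omega\,\frac{e^{-i\omega(t - R/c)}}{R}\,dV.$$

Factoring $e^{-i\omega t}$ and introducing the absolute value of the wave vector $k = \omega/c$, we have:

$$\phi_\omega = \int \rho_\omega\,\frac{e^{ikR}}{R}\,dV. \tag{64.1}$$

Similarly, for $\mathbf{A}_\omega$ we get

$$\mathbf{A}_\omega = \int \mathbf{j}_\omega\,\frac{e^{ikR}}{cR}\,dV. \tag{64.2}$$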

We note that formula (64.1) represents a generalization of the solution of the Poisson 
equation to a more general equation of the form 

$$\Delta\phi_\omega + k^2\phi_\omega = -4\pi\rho_\omega \tag{64.3}$$

(obtained from equations (62.4) for $\rho$, $\phi$ depending on the time through the factor $e^{-i\omega t}$).
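
That the kernel $e^{ikR}/R$ in (64.1) is the appropriate one can be checked directly, at least away from the point $R = 0$. As a sketch, using the radial form of the Laplacian for a function of $R$ alone:

$$\Delta\frac{e^{ikR}}{R} = \frac{1}{R}\frac{d^2}{dR^2}\left(R\,\frac{e^{ikR}}{R}\right) = -k^2\,\frac{e^{ikR}}{R} \qquad (R \neq 0),$$

so that $(\Delta + k^2)(e^{ikR}/R) = 0$ there. Near $R = 0$ the kernel behaves like $1/R$, and therefore reproduces the point source in the same way as in the Poisson equation, to which everything reduces for $k = 0$.
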
If we were dealing with expansion into a Fourier integral, then the Fourier components of 
the charge density would be 

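$$\rho_\omega = \int_{-\infty}^{+\infty} \rho\,e^{i\omega t}\,dt.$$

Substituting this expression in (64.1), we get

$$\phi_\omega = \int_{-\infty}^{+\infty}\!\!\int \frac{\rho}{R}\,e^{i(\omega t + kR)}\,dV\,dt. \tag{64.4}$$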


We must still go over from the continuous distribution of charge density to the point charges 
whose motion we are actually considering. Thus, if there is just one point charge, we set 

$$\rho = e\,\delta[\mathbf{r} - \mathbf{r}_0(t)],$$

where r 0 (t) is the radius vector of the charge, and is a given function of the time. Substituting 
this expression in (64.4) and carrying out the space integration [which reduces to replacing 
r by r 0 (t)], we get: 



$$\phi_\omega = e\int_{-\infty}^{+\infty} \frac{1}{R(t)}\,e^{i\omega[t + R(t)/c]}\,dt, \tag{64.5}$$

where now $R(t)$ is the distance from the moving particle to the point of observation. Similarly
we find for the vector potential:

$$\mathbf{A}_\omega = \frac{e}{c}\int_{-\infty}^{+\infty} \frac{\mathbf{v}(t)}{R(t)}\,e^{i\omega[t + R(t)/c]}\,dt, \tag{64.6}$$

where $\mathbf{v} = \dot{\mathbf{r}}_0(t)$ is the velocity of the particle.

Formulas analogous to (64.5), (64.6) can also be written for the case where the spectral
resolution of the charge and current densities contains a discrete series of frequencies. Thus,
for a periodic motion of a point charge (with period $T = 2\pi/\omega_0$) the spectral resolution of the
field contains only frequencies of the form $n\omega_0$, and the corresponding components of the
vector potential are

$$\mathbf{A}_n = \frac{e}{cT}\int_0^T \frac{\mathbf{v}(t)}{R(t)}\,e^{in\omega_0[t + R(t)/c]}\,dt \tag{64.7}$$

(and similarly for $\phi_n$). In both (64.6) and (64.7) the Fourier components are defined in
accordance with § 49. 
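
For a given periodic motion the integral (64.7) is straightforward to evaluate numerically. The sketch below is an illustration only: the function `harmonic`, its arguments, and the example (a charge on a circle observed from a point on the axis, in units with c = 1) are assumptions, not part of the text. Uniform sampling over one period is used because it is very accurate for smooth periodic integrands.

```python
import cmath
import math

def harmonic(n, weight, dist, T, c=1.0, steps=2000):
    """(1/T) * integral_0^T weight(t)/R(t) * exp(i n w0 [t + R(t)/c]) dt.

    weight and dist are hypothetical callables: weight(t) is a velocity
    component (or 1 for the scalar potential), dist(t) is the distance R(t).
    """
    w0 = 2.0 * math.pi / T
    total = 0j
    for j in range(steps):
        t = T * j / steps
        R = dist(t)
        total += weight(t) / R * cmath.exp(1j * n * w0 * (t + R / c))
    return total / steps

# Example (assumption): circle of radius a in the xy-plane, field point (0, 0, z)
# on the axis, so that R(t) is constant.
a, w0, z, e, c = 1.0, 0.5, 4.0, 1.0, 1.0
T = 2.0 * math.pi / w0
R_axis = lambda t: math.hypot(a, z)
vx = lambda t: -a * w0 * math.sin(w0 * t)

A1_x = e / c * harmonic(1, vx, R_axis, T, c)   # x-component of A_1 from (64.7)
A2_x = e / c * harmonic(2, vx, R_axis, T, c)   # x-component of A_2
```

On the axis the distance does not depend on time, so only the fundamental survives: the modulus of `A1_x` equals e a ω₀/(2cR), while `A2_x` vanishes up to rounding.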


PROBLEM 


Find the expansion in plane waves of the field of a charge in uniform rectilinear motion. 

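Solution: We proceed in similar fashion to that used in § 51. We write the charge density in the form
$\rho = e\,\delta(\mathbf{r} - \mathbf{v}t)$, where $\mathbf{v}$ is the velocity of the particle. Taking Fourier components of the equation
$\Box\phi = -4\pi e\,\delta(\mathbf{r} - \mathbf{v}t)$, we find $(\Box\phi)_{\mathbf{k}} = -4\pi e\,e^{-i(\mathbf{v}\cdot\mathbf{k})t}$.

On the other hand, from

$$\phi = \int e^{i\mathbf{k}\cdot\mathbf{r}}\,\phi_{\mathbf{k}}\,\frac{d^3k}{(2\pi)^3}$$

we have

$$(\Box\phi)_{\mathbf{k}} = -k^2\phi_{\mathbf{k}} - \frac{1}{c^2}\frac{\partial^2\phi_{\mathbf{k}}}{\partial t^2}.$$

Thus,

$$\frac{1}{c^2}\frac{\partial^2\phi_{\mathbf{k}}}{\partial t^2} + k^2\phi_{\mathbf{k}} = 4\pi e\,e^{-i(\mathbf{k}\cdot\mathbf{v})t},$$

from which, finally

$$\phi_{\mathbf{k}} = 4\pi e\,\frac{e^{-i(\mathbf{k}\cdot\mathbf{v})t}}{k^2 - \left(\dfrac{\mathbf{k}\cdot\mathbf{v}}{c}\right)^2}.$$

From this it follows that the wave with wave vector $\mathbf{k}$ has the frequency $\omega = \mathbf{k}\cdot\mathbf{v}$. Similarly, we obtain
for the vector potential,

$$\mathbf{A}_{\mathbf{k}} = \frac{4\pi e}{c}\,\frac{\mathbf{v}\,e^{-i(\mathbf{k}\cdot\mathbf{v})t}}{k^2 - \left(\dfrac{\mathbf{k}\cdot\mathbf{v}}{c}\right)^2}.$$

Finally, we have for the fields,

$$\mathbf{E}_{\mathbf{k}} = -i\mathbf{k}\phi_{\mathbf{k}} + i\,\frac{\mathbf{k}\cdot\mathbf{v}}{c}\,\mathbf{A}_{\mathbf{k}} = i4\pi e\,\frac{-\mathbf{k} + \dfrac{(\mathbf{k}\cdot\mathbf{v})}{c^2}\mathbf{v}}{k^2 - \left(\dfrac{\mathbf{k}\cdot\mathbf{v}}{c}\right)^2}\,e^{-i(\mathbf{k}\cdot\mathbf{v})t},$$

$$\mathbf{H}_{\mathbf{k}} = i\mathbf{k}\times\mathbf{A}_{\mathbf{k}} = \frac{i4\pi e}{c}\,\frac{\mathbf{k}\times\mathbf{v}}{k^2 - \left(\dfrac{\mathbf{k}\cdot\mathbf{v}}{c}\right)^2}\,e^{-i(\mathbf{k}\cdot\mathbf{v})t}.$$


§ 65. The Lagrangian to terms of second order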

In ordinary classical mechanics, we can describe a system of particles interacting with 
each other with the aid of a Lagrangian which depends only on the coordinates and velocities 
of these particles (at one and the same time). The possibility of doing this is, in the last 
analysis, dependent on the fact that in mechanics the velocity of propagation of interactions 
is assumed to be infinite. 

We already know that because of the finite velocity of propagation, the field must be 
considered as an independent system with its own “degrees of freedom”. From this it 
follows that if we have a system of interacting particles (charges), then to describe it we 
must consider the system consisting of these particles and the field. Therefore, when we take 
into account the finite velocity of propagation of interactions, it is impossible to describe the 
system of interacting particles rigorously with the aid of a Lagrangian, depending only on 
the coordinates and velocities of the particles and containing no quantities related to the 
internal “degrees of freedom” of the field. 

However, if the velocity $v$ of all the particles is small compared with the velocity of light,
then the system can be described by a certain approximate Lagrangian. It turns out to be
possible to introduce a Lagrangian describing the system, not only when all powers of $v/c$
are neglected (classical Lagrangian), but also to terms of second order, $v^2/c^2$. This last
remark is related to the fact that the radiation of electromagnetic waves by moving charges
(and consequently, the appearance of a “self”-field) occurs only in the third approximation
in $v/c$ (see later, in § 67).†

As a preliminary, we note that in zero’th approximation, that is, when we completely 
neglect the retardation of the potentials, the Lagrangian for a system of charges has the form 

$$L^{(0)} = \sum_a \frac{1}{2}m_a v_a^2 - \sum_{a>b}\frac{e_a e_b}{R_{ab}} \tag{65.1}$$

(the summation extends over the charges which make up the system). The second term is the 
potential energy of interaction as it would be for charges at rest. 

To get the next approximation, we proceed in the following fashion. The Lagrangian for 
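a charge $e_a$ in an external field is

$$L_a = -m_a c^2\sqrt{1 - \frac{v_a^2}{c^2}} - e_a\phi + \frac{e_a}{c}\,\mathbf{A}\cdot\mathbf{v}_a. \tag{65.2}$$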



Choosing any one of the charges of the system, we determine the potentials of the field 
produced by all the other charges at the position of the first, and express them in terms of 
the coordinates and velocities of the charges which produce this field (this can be done only 
approximately: for $\phi$, to terms of order $v^2/c^2$, and for $\mathbf{A}$, to terms in $v/c$). Substituting the
expressions for the potentials obtained in this way in (65.2), we get the Lagrangian for one
of the charges of the system (for a given motion of the other charges). From this, one can
then easily find the Lagrangian for the whole system.

† For systems consisting of particles with the same charge-to-mass ratio, the appearance of radiation is
put off to the fifth approximation in $v/c$; in such a case there is a Lagrangian to terms of fourth order in
$v/c$. [See B.M. Barker and R.F. O’Connell, Can. J. Phys. 58, 1659 (1980).]

We start from the expressions for the retarded potentials 


$$\phi = \int \frac{\rho_{t-R/c}}{R}\,dV, \qquad \mathbf{A} = \frac{1}{c}\int \frac{\mathbf{j}_{t-R/c}}{R}\,dV.$$


If the velocities of all the charges are small compared with the velocity of light, then the 
charge distribution does not change significantly during the time R/c. Therefore we can 
expand $\rho_{t-R/c}$ and $\mathbf{j}_{t-R/c}$ in series of powers of $R/c$. For the scalar potential we thus find, to
terms of second order: 


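$$\phi = \int \frac{\rho\,dV}{R} - \frac{1}{c}\frac{\partial}{\partial t}\int \rho\,dV + \frac{1}{2c^2}\frac{\partial^2}{\partial t^2}\int R\rho\,dV$$

($\rho$ without indices is the value of $\rho$ at time $t$; the time differentiations can clearly be taken
out from under the integral sign). But $\int \rho\,dV$ is the constant total charge of the system.
Therefore the second term in our expression is zero, so that

$$\phi = \int \frac{\rho\,dV}{R} + \frac{1}{2c^2}\frac{\partial^2}{\partial t^2}\int R\rho\,dV. \tag{65.3}$$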
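
The expansion behind (65.3) can be written out term by term. As a sketch, the Taylor series of the retarded density about the time $t$ is

$$\rho_{t-R/c} = \rho - \frac{R}{c}\frac{\partial\rho}{\partial t} + \frac{R^2}{2c^2}\frac{\partial^2\rho}{\partial t^2} - \dots,$$

so that after division by $R$ the successive terms of the integrand are $\rho/R$, $-(1/c)\,\partial\rho/\partial t$ and $(R/2c^2)\,\partial^2\rho/\partial t^2$. The middle term no longer contains $R$, which is why its integral reduces to the time derivative of the total charge and drops out.
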

We can proceed similarly with A. But the expression for the vector potential in terms of 
the current density already contains 1/c, and when substituted in the Lagrangian is multiplied 
once more by 1/c. Since we are looking for a Lagrangian which is correct only to terms of 
second order, we can limit ourselves to the first term in the expansion of A, that is, 

$$\mathbf{A} = \frac{1}{c}\int \frac{\rho\mathbf{v}}{R}\,dV \tag{65.4}$$

(we have substituted $\mathbf{j} = \rho\mathbf{v}$).

Let us first assume that there is only a single point charge e. Then we obtain from (65.3) 
and (65.4), 


$$\phi = \frac{e}{R} + \frac{e}{2c^2}\frac{\partial^2 R}{\partial t^2}, \qquad \mathbf{A} = \frac{e\mathbf{v}}{cR}, \tag{65.5}$$


where R is the distance from the charge. 

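We choose in place of $\phi$ and $\mathbf{A}$ other potentials $\phi'$ and $\mathbf{A}'$, making the transformation (see
§ 18):

$$\phi' = \phi - \frac{1}{c}\frac{\partial f}{\partial t}, \qquad \mathbf{A}' = \mathbf{A} + \operatorname{grad} f,$$

in which we choose for $f$ the function

$$f = \frac{e}{2c}\frac{\partial R}{\partial t}.$$

Then we get†

$$\phi' = \frac{e}{R}, \qquad \mathbf{A}' = \frac{e\mathbf{v}}{cR} + \frac{e}{2c}\nabla\frac{\partial R}{\partial t}.$$

† These potentials no longer satisfy the Lorentz condition (62.1), nor the equations (62.3)-(62.4).
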


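To calculate $\mathbf{A}'$ we note first of all that $\nabla(\partial R/\partial t) = (\partial/\partial t)\nabla R$. The grad operator here means
differentiation with respect to the coordinates of the field point at which we seek the value
of $\mathbf{A}'$. Therefore $\nabla R$ is the unit vector $\mathbf{n}$ directed from the charge $e$ to the field point, so that

$$\mathbf{A}' = \frac{e\mathbf{v}}{cR} + \frac{e}{2c}\,\dot{\mathbf{n}}.$$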


We also write: 


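$$\dot{\mathbf{n}} = \frac{\partial}{\partial t}\left(\frac{\mathbf{R}}{R}\right) = \frac{\dot{\mathbf{R}}}{R} - \frac{\mathbf{R}\dot{R}}{R^2}.$$

But the derivative $-\dot{\mathbf{R}}$ for a given field point is the velocity $\mathbf{v}$ of the charge, and the
derivative $\dot{R}$ is easily determined by differentiating $R^2 = \mathbf{R}^2$, that is, by writing

$$R\dot{R} = \mathbf{R}\cdot\dot{\mathbf{R}} = -\mathbf{R}\cdot\mathbf{v}.$$

Thus,

$$\dot{\mathbf{n}} = \frac{-\mathbf{v} + \mathbf{n}(\mathbf{n}\cdot\mathbf{v})}{R}.$$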

Substituting this in the expression for A', we get finally: 




$$\mathbf{A}' = \frac{e[\mathbf{v} + (\mathbf{v}\cdot\mathbf{n})\mathbf{n}]}{2cR}. \tag{65.6}$$


If there are several charges then we must, clearly, sum these expressions over all the charges. 

Substituting these expressions in (65.2), we obtain the Lagrangian $L_a$ for the charge $e_a$ (for
a fixed motion of the other charges). In doing this we must also expand the first term in
(65.2) in powers of $v_a/c$, retaining terms up to the second order. Thus we find:

$$L_a = \frac{m_a v_a^2}{2} + \frac{1}{8}\frac{m_a v_a^4}{c^2} - e_a{\sum_b}'\frac{e_b}{R_{ab}} + \frac{e_a}{2c^2}{\sum_b}'\frac{e_b}{R_{ab}}\left[\mathbf{v}_a\cdot\mathbf{v}_b + (\mathbf{v}_a\cdot\mathbf{n}_{ab})(\mathbf{v}_b\cdot\mathbf{n}_{ab})\right]$$

(the summation goes over all the charges except $e_a$; $\mathbf{n}_{ab}$ is the unit vector from $e_b$ to $e_a$).

From this, it is no longer difficult to get the Lagrangian for the whole system. It is easy 
to convince oneself that this function is not the sum of the L a for all the charges, but has the 
form 


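$$L = \sum_a \frac{m_a v_a^2}{2} + \sum_a \frac{m_a v_a^4}{8c^2} - \sum_{a>b}\frac{e_a e_b}{R_{ab}} + \sum_{a>b}\frac{e_a e_b}{2c^2 R_{ab}}\left[\mathbf{v}_a\cdot\mathbf{v}_b + (\mathbf{v}_a\cdot\mathbf{n}_{ab})(\mathbf{v}_b\cdot\mathbf{n}_{ab})\right]. \tag{65.7}$$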


Actually, for each of the charges under a given motion of all the others, this function L goes 
over into L a as given above. The expression (65.7) determines the Lagrangian of a system 
of charges correctly to terms of second order. (It was first obtained by C. G. Darwin, 1922.) 

Finally we find the Hamiltonian of a system of charges in this same approximation. This 
could be done by the general rule for calculating $\mathcal{H}$ from $L$; however it is simpler to proceed
as follows. The second and fourth terms in (65.7) are small corrections to $L^{(0)}$ (65.1). On the
other hand, we know from mechanics that for small changes of $L$ and $\mathcal{H}$, the additions to
them are equal in magnitude and opposite in sign (here the variations of $L$ are considered for
constant coordinates and velocities, while the changes in $\mathcal{H}$ refer to constant coordinates and
momenta).†

Therefore we can at once write $\mathcal{H}$, subtracting from

$$\mathcal{H}^{(0)} = \sum_a \frac{p_a^2}{2m_a} + \sum_{a>b}\frac{e_a e_b}{R_{ab}}$$

the second and fourth terms of (65.7), replacing the velocities in them by the first approximation
$\mathbf{v}_a = \mathbf{p}_a/m_a$. Thus,


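$$\mathcal{H} = \sum_a \frac{p_a^2}{2m_a} + \sum_{a>b}\frac{e_a e_b}{R_{ab}} - \sum_a \frac{p_a^4}{8c^2 m_a^3} - \sum_{a>b}\frac{e_a e_b}{2c^2 m_a m_b R_{ab}}\left[\mathbf{p}_a\cdot\mathbf{p}_b + (\mathbf{p}_a\cdot\mathbf{n}_{ab})(\mathbf{p}_b\cdot\mathbf{n}_{ab})\right]. \tag{65.8}$$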
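
The shortcut used to obtain (65.8) can be tested numerically against the general rule. The sketch below codes the Lagrangian (65.7) and the Hamiltonian (65.8), computes the generalized momenta by differentiating (65.7) with respect to the velocities, and compares the Hamiltonian with the energy obtained from the Lagrangian as the sum of p·v minus L. All function names, the sample masses, charges, positions and velocities, and the value c = 100 are assumptions chosen for illustration; the two results are expected to differ only in terms of fourth order in 1/c.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pairs(e, r):
    """Yield (a, b, e_a*e_b/R_ab, n_ab) for every pair a > b."""
    for a in range(len(r)):
        for b in range(a):
            d = [x - y for x, y in zip(r[a], r[b])]
            R = math.sqrt(dot(d, d))
            yield a, b, e[a] * e[b] / R, [x / R for x in d]

def lagrangian(m, e, r, v, c):
    """Second-order Lagrangian (65.7)."""
    L = sum(ma * dot(va, va) / 2 + ma * dot(va, va) ** 2 / (8 * c**2)
            for ma, va in zip(m, v))
    for a, b, q, n in pairs(e, r):
        L += -q + q / (2 * c**2) * (dot(v[a], v[b]) + dot(v[a], n) * dot(v[b], n))
    return L

def hamiltonian(m, e, r, p, c):
    """Second-order Hamiltonian (65.8)."""
    H = sum(dot(pa, pa) / (2 * ma) - dot(pa, pa) ** 2 / (8 * c**2 * ma**3)
            for ma, pa in zip(m, p))
    for a, b, q, n in pairs(e, r):
        H += q - q / (2 * c**2 * m[a] * m[b]) * (
            dot(p[a], p[b]) + dot(p[a], n) * dot(p[b], n))
    return H

def momenta(m, e, r, v, c):
    """Generalized momenta p_a = dL/dv_a, from differentiating (65.7)."""
    p = [[ma * x * (1 + dot(va, va) / (2 * c**2)) for x in va]
         for ma, va in zip(m, v)]
    for a, b, q, n in pairs(e, r):
        k = q / (2 * c**2)
        for i in range(3):
            p[a][i] += k * (v[b][i] + n[i] * dot(v[b], n))
            p[b][i] += k * (v[a][i] + n[i] * dot(v[a], n))
    return p

# Hypothetical two-charge configuration with velocities well below c = 100.
m, e, c = [1.0, 2.0], [1.0, -1.0], 100.0
r = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0)]
v = [(0.3, 0.1, 0.0), (-0.2, 0.4, 0.1)]

p = momenta(m, e, r, v, c)
energy = sum(dot(pa, va) for pa, va in zip(p, v)) - lagrangian(m, e, r, v, c)
gap = abs(hamiltonian(m, e, r, p, c) - energy)
```

Here `gap` should be far smaller than the second-order corrections themselves, which for these sample values are of order 10⁻⁶.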


PROBLEMS 

1. Determine (correctly to terms of second order) the centre of inertia of a system of interacting particles. 
Solution: The problem is solved most simply by using the formula 

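$$\mathbf{R} = \frac{\sum_a \mathcal{E}_a\mathbf{r}_a + \int W\mathbf{r}\,dV}{\sum_a \mathcal{E}_a + \int W\,dV}$$

[see (14.6)], where $\mathcal{E}_a$ is the kinetic energy of the particle (including its rest energy), and $W$ is the energy
density of the field produced by the particles. Since the $\mathcal{E}_a$ contain the large quantities $m_a c^2$, it is sufficient,
in obtaining the next approximation, to consider only those terms in $\mathcal{E}_a$ and $W$ which do not contain $c$, i.e.
we need consider only the nonrelativistic kinetic energy of the particles and the energy of the electrostatic
field. We then have:

$$\int W\mathbf{r}\,dV = \frac{1}{8\pi}\int E^2\mathbf{r}\,dV = \frac{1}{8\pi}\int (\nabla\phi)^2\mathbf{r}\,dV = \frac{1}{8\pi}\oint \left(d\mathbf{f}\cdot\nabla\frac{\phi^2}{2}\right)\mathbf{r} - \frac{1}{8\pi}\int \nabla\frac{\phi^2}{2}\,dV - \frac{1}{8\pi}\int \phi\,\Delta\phi\cdot\mathbf{r}\,dV;$$

the integral over the infinitely distant surface vanishes; the second integral also is transformed into a surface
integral and vanishes, while we substitute $\Delta\phi = -4\pi\rho$ in the third integral and obtain:

$$\int W\mathbf{r}\,dV = \frac{1}{2}\int \rho\phi\mathbf{r}\,dV = \frac{1}{2}\sum_a e_a\phi_a\mathbf{r}_a,$$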

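† See Mechanics, § 40.

where $\phi_a$ is the potential produced at the point $\mathbf{r}_a$ by all the charges other than $e_a$.†
Finally, we get:

$$\mathbf{R} = \frac{1}{\mathcal{E}}\sum_a \mathbf{r}_a\left(m_a c^2 + \frac{p_a^2}{2m_a} + \frac{e_a}{2}{\sum_b}'\frac{e_b}{R_{ab}}\right)$$

(with a summation over all $b$ except $b = a$), where

$$\mathcal{E} = \sum_a\left(m_a c^2 + \frac{p_a^2}{2m_a}\right) + \sum_{a>b}\frac{e_a e_b}{R_{ab}}$$

is the total energy of the system. Thus in this approximation the coordinates of the centre of inertia can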
actually be expressed in terms of quantities referring only to the particles. 

2. Write the Hamiltonian in second approximation for a system of two particles, omitting the motion of 
the system as a whole. 

Solution: We choose a system of reference in which the total momentum of the two particles is zero. 
Expressing the momenta as derivatives of the action, we have 

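$$\mathbf{p}_1 + \mathbf{p}_2 = \partial S/\partial\mathbf{r}_1 + \partial S/\partial\mathbf{r}_2 = 0.$$

From this it is clear that in the reference system chosen the action is a function of $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$, the difference
of the radius vectors of the two particles. Therefore we have $\mathbf{p}_2 = -\mathbf{p}_1 = \mathbf{p}$, where $\mathbf{p} = \partial S/\partial\mathbf{r}$ is the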
momentum of the relative motion of the particles. The Hamiltonian is 


† The elimination of the self-field of the particles corresponds to the mass “renormalization” mentioned
in the footnote on p. 97.



